# **Analyzing Drug Dispensation Trends in Calgary and Edmonton from 2010 to 2018**

![brain-3829057_960_720](brain-3829057_960_720.jpg)


## Overview of the data set

In [9]:
select *
from Dispensation.csv

Unnamed: 0,city,year,sex,age,drug_type,dispendation_rate,total_dispensation,unique_dispensation,total_population,standard_error,standard_score,alberta_rate
0,Calgary,2010,F,0-4,ALPRAZOLAM,0.000000,0,0,43761.781370,0.045698,0.000000,0.000000
1,Calgary,2010,F,0-4,AMITRIPTYLINE,0.000000,0,0,43761.781370,0.045698,-0.367835,0.016809
2,Calgary,2010,F,0-4,BROMAZEPAM,0.000000,0,0,43761.781370,0.045698,0.000000,0.000000
3,Calgary,2010,F,0-4,BUPROPION,0.000000,0,0,43761.781370,0.045698,0.000000,0.000000
4,Calgary,2010,F,0-4,BUSPIRONE,0.000000,0,0,43761.781370,0.045698,0.000000,0.000000
...,...,...,...,...,...,...,...,...,...,...,...,...
20452,Edmonton,2018,M,90+,SERTRALINE,23.467394,1159,62,2641.963567,2.980362,0.055434,23.302181
20453,Edmonton,2018,M,90+,TRANYLCYPROMINE,0.000000,0,0,2641.963567,0.755868,0.000000,0.000000
20454,Edmonton,2018,M,90+,TRAZODONE,90.463019,4060,239,2641.963567,5.851566,4.173140,66.043614
20455,Edmonton,2018,M,90+,VENLAFAXINE,15.518761,797,41,2641.963567,2.423623,-0.229411,16.074766
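
The sample rows above suggest that `dispendation_rate` is the number of unique recipients per 1,000 residents. That is an inference from the printed values, not a documented definition. A minimal sketch that checks it against three of the displayed rows (the helper name `rate_per_1000` is hypothetical):

```python
# Values copied from the sample output above (Edmonton, 2018, M, 90+).
# Tuple layout: (drug_type, dispendation_rate, unique_dispensation, total_population)
sample_rows = [
    ("SERTRALINE", 23.467394, 62, 2641.963567),
    ("TRAZODONE", 90.463019, 239, 2641.963567),
    ("VENLAFAXINE", 15.518761, 41, 2641.963567),
]


def rate_per_1000(unique_dispensation, total_population):
    """Assumed formula: unique recipients per 1,000 residents."""
    return 1000 * unique_dispensation / total_population


for drug, reported, unique, population in sample_rows:
    computed = rate_per_1000(unique, population)
    print(f"{drug}: reported={reported:.3f} computed={computed:.3f}")
```

If the assumption holds for the rest of the file, the rate is a per-group figure, so adding rates across groups of different sizes does not give a population-wide rate.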


## What are all the names of drug types and how many are there?

In [78]:
select drug_type, count(DISTINCT drug_type) AS TOTAL -- TOTAL is 1 on every row: each group contains exactly one drug_type
from Dispensation.csv
group by drug_type


Unnamed: 0,drug_type,TOTAL
0,ALPRAZOLAM,1
1,AMITRIPTYLINE,1
2,BROMAZEPAM,1
3,BUPROPION,1
4,BUSPIRONE,1
5,CHLORDIAZEPOXIDE,1
6,CITALOPRAM,1
7,CLOBAZAM,1
8,CLOMIPRAMINE,1
9,CLORAZEPATE POTASSIUM,1
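
The grouped query lists the names, but its TOTAL column cannot answer "how many are there". One way to get both answers is sketched below with Python's built-in `sqlite3` and a few made-up rows standing in for Dispensation.csv. The table name `dispensation` and the toy data are assumptions for illustration only.

```python
import sqlite3

# Hypothetical toy rows standing in for Dispensation.csv.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE dispensation (city TEXT, drug_type TEXT)")
con.executemany(
    "INSERT INTO dispensation VALUES (?, ?)",
    [
        ("Calgary", "ALPRAZOLAM"),
        ("Edmonton", "ALPRAZOLAM"),
        ("Calgary", "BUPROPION"),
        ("Edmonton", "CITALOPRAM"),
    ],
)

# Names: one row per distinct drug type.
names = [
    row[0]
    for row in con.execute(
        "SELECT DISTINCT drug_type FROM dispensation ORDER BY drug_type"
    )
]

# Count: a single number, because there is no GROUP BY.
(n_types,) = con.execute(
    "SELECT COUNT(DISTINCT drug_type) FROM dispensation"
).fetchone()

print(names)
print(n_types)
```

Dropping the `GROUP BY` is the key change: `COUNT(DISTINCT ...)` then runs over the whole table instead of over one-drug groups.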


## We want to find which city has the higher summed dispensation rate

In [75]:
select city, ROUND(SUM(dispendation_rate),0) as total
from Dispensation.csv
group by city
order by total 




Unnamed: 0,city,total
0,Calgary,67505.0
1,Edmonton,81699.0


## We also want to find which drug is dispensed most across the two cities

In [46]:
SELECT drug_type, SUM(total_dispensation) AS total_dispensation
FROM Dispensation.csv
WHERE city IN ('Edmonton', 'Calgary')
GROUP BY drug_type
ORDER BY total_dispensation DESC







Unnamed: 0,drug_type,total_dispensation
0,CITALOPRAM,3288945.0
1,VENLAFAXINE,3211742.0
2,LORAZEPAM,2588397.0
3,ESCITALOPRAM,2319382.0
4,TRAZODONE,2275491.0
5,BUPROPION,1813346.0
6,SERTRALINE,1662862.0
7,DULOXETINE,1401171.0
8,MIRTAZAPINE,1394028.0
9,AMITRIPTYLINE,1234669.0


## Let's see how many of each drug were dispensed per year

In [68]:
SELECT drug_type, year, SUM(total_dispensation) AS total_dispensation
FROM Dispensation.csv
GROUP BY drug_type, year -- grouping by year already gives one row per distinct year, so no self-join is needed
ORDER BY year ASC, total_dispensation DESC




Unnamed: 0,drug_type,year,total_dispensation
0,CITALOPRAM,2010,341800.0
1,VENLAFAXINE,2010,325467.0
2,LORAZEPAM,2010,214877.0
3,TRAZODONE,2010,138972.0
4,BUPROPION,2010,134408.0
...,...,...,...
305,DESIPRAMINE,2018,1869.0
306,PHENELZINE,2018,702.0
307,VILAZODONE,2018,246.0
308,CLORAZEPATE POTASSIUM,2018,0.0


## We can narrow it down and find the most prescribed drug by year

In [22]:
SELECT drug_type, year, total_dispensation
FROM (
    SELECT drug_type, year, SUM(total_dispensation) as total_dispensation, 
           ROW_NUMBER() OVER (PARTITION BY year ORDER BY SUM(total_dispensation) DESC) as rank  --Using a window function to assign rank to the row with the highest sum
    FROM Dispensation.csv
    GROUP BY drug_type, year) 
WHERE rank = 1 --Showing the top most prescribed drug
ORDER BY year 



Unnamed: 0,drug_type,year,total_dispensation
0,CITALOPRAM,2010,341800.0
1,CITALOPRAM,2011,409436.0
2,CITALOPRAM,2012,371338.0
3,CITALOPRAM,2013,380224.0
4,CITALOPRAM,2014,345325.0
5,CITALOPRAM,2015,361362.0
6,VENLAFAXINE,2016,382665.0
7,ESCITALOPRAM,2017,409142.0
8,ESCITALOPRAM,2018,432151.0
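
`ROW_NUMBER()` always returns exactly one row per year, so if two drugs tied for first place, one of them would be dropped arbitrarily. `RANK()` keeps every tied row. The sketch below contrasts the two using Python's built-in `sqlite3` (version 3.25 or later for window functions) on made-up rows. The table name, drug names, and counts are hypothetical, and the sums are computed in a CTE first so the window function runs over already aggregated rows.

```python
import sqlite3

# Hypothetical toy rows: drugs A and B tie in 2011 (4 each after summing).
rows = [
    ("A", 2010, 3), ("A", 2010, 2), ("B", 2010, 3),
    ("A", 2011, 4), ("B", 2011, 1), ("B", 2011, 3),
]

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE dispensation "
    "(drug_type TEXT, year INTEGER, total_dispensation INTEGER)"
)
con.executemany("INSERT INTO dispensation VALUES (?, ?, ?)", rows)


def top_per_year(window_fn):
    """Return (year, drug_type) rows that the given window function ranks first."""
    query = f"""
        WITH totals AS (
            SELECT drug_type, year, SUM(total_dispensation) AS total
            FROM dispensation
            GROUP BY drug_type, year
        )
        SELECT year, drug_type
        FROM (
            SELECT year, drug_type,
                   {window_fn}() OVER (PARTITION BY year ORDER BY total DESC) AS rnk
            FROM totals
        ) AS ranked
        WHERE rnk = 1
        ORDER BY year, drug_type
    """
    return con.execute(query).fetchall()


row_number_result = top_per_year("ROW_NUMBER")  # one row per year, tie broken arbitrarily
rank_result = top_per_year("RANK")              # tied drugs are both kept

print(row_number_result)
print(rank_result)  # [(2010, 'A'), (2011, 'A'), (2011, 'B')]
```

Whether ties matter for the real data is not something the output above can show. With `RANK()` a tied year would simply appear twice in the result.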


## Now let's look at the age groups and find out how many dispensations there were in each

In [15]:
SELECT age, sum(total_dispensation) as total
FROM Dispensation.csv
where total_dispensation > 1
group by age
order by total

Unnamed: 0,age,total
0,0-4,26003.0
1,5-9,65078.0
2,10-14,208655.0
3,15-19,551586.0
4,20-24,845166.0
5,90+,971669.0
6,75-79,1033736.0
7,70-74,1082372.0
8,85-89,1112926.0
9,80-84,1165337.0


## Here we can see the type of drug most prescribed by age group

In [13]:
SELECT age, drug_type, total_dispensation
FROM (
  SELECT age, drug_type, SUM(total_dispensation) AS total_dispensation,
         ROW_NUMBER() OVER (PARTITION BY age ORDER BY SUM(total_dispensation) DESC) AS rn
  FROM Dispensation.csv
  WHERE total_dispensation > 1
  GROUP BY age, drug_type
) t
WHERE rn = 1
ORDER BY total_dispensation desc

Unnamed: 0,age,drug_type,total_dispensation
0,50-54,VENLAFAXINE,431631.0
1,55-59,VENLAFAXINE,401552.0
2,45-49,VENLAFAXINE,340780.0
3,60-64,VENLAFAXINE,304238.0
4,85-89,CITALOPRAM,282880.0
5,40-44,VENLAFAXINE,280140.0
6,90+,CITALOPRAM,278565.0
7,80-84,CITALOPRAM,261651.0
8,35-39,VENLAFAXINE,249148.0
9,30-34,ESCITALOPRAM,217081.0


## Now let's find out which drug is the most prescribed by age group

In [116]:
WITH total_dispensation_by_age AS (
    SELECT age, SUM(total_dispensation) AS total
    FROM Dispensation.csv
    WHERE total_dispensation > 1
    GROUP BY age
)

SELECT d.age, d.drug_type, SUM(d.total_dispensation) AS total_dispensation
FROM Dispensation.csv AS d
JOIN total_dispensation_by_age AS t 
ON d.age = t.age
WHERE d.total_dispensation > 1
GROUP BY d.age, d.drug_type
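-- Note: `total` in the CTE is the age group's sum across ALL drugs, so a single drug's
-- sum can only equal it when the group has one drug; that is why no rows are returned.
HAVING SUM(d.total_dispensation) = (SELECT MAX(total) FROM total_dispensation_by_age WHERE age = d.age)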
ORDER BY d.age, total_dispensation DESC;


Unnamed: 0,age,drug_type,total_dispensation
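
The empty result comes from comparing a per-drug sum against a per-age-group sum. One possible repair of the CTE approach is sketched below: total per (age, drug) first, take the maximum of those totals per age, then join back. It runs on made-up rows through Python's built-in `sqlite3`. The table name, the CTE names `drug_totals` and `age_max`, and the data are all hypothetical, and the `total_dispensation > 1` filter is left out for brevity.

```python
import sqlite3

# Hypothetical toy rows: (age, drug_type, total_dispensation).
rows = [
    ("20-24", "X", 10), ("20-24", "X", 5), ("20-24", "Y", 12),
    ("90+", "Y", 7), ("90+", "Z", 9),
]

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE dispensation "
    "(age TEXT, drug_type TEXT, total_dispensation INTEGER)"
)
con.executemany("INSERT INTO dispensation VALUES (?, ?, ?)", rows)

query = """
    WITH drug_totals AS (          -- one row per (age, drug)
        SELECT age, drug_type, SUM(total_dispensation) AS total
        FROM dispensation
        GROUP BY age, drug_type
    ),
    age_max AS (                   -- largest per-drug total within each age group
        SELECT age, MAX(total) AS max_total
        FROM drug_totals
        GROUP BY age
    )
    SELECT t.age, t.drug_type, t.total
    FROM drug_totals AS t
    JOIN age_max AS m
      ON m.age = t.age AND m.max_total = t.total
    ORDER BY t.age
"""

top_by_age = con.execute(query).fetchall()
print(top_by_age)  # [('20-24', 'X', 15), ('90+', 'Z', 9)]
```

On the real data this should agree with the `ROW_NUMBER()` query from the previous cell, except that tied drugs would all be returned.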


## Let's look into whether males or females are prescribed more

In [125]:
select sex, sum(total_dispensation) as total
FROM Dispensation.csv
group by sex


Unnamed: 0,sex,total
0,F,16848404.0
1,M,9633930.0


Now let's calculate the difference

In [None]:
-- Conditional aggregation: total each sex in one pass, then subtract
SELECT
    SUM(CASE WHEN sex = 'F' THEN total_dispensation ELSE 0 END) AS female_total,
    SUM(CASE WHEN sex = 'M' THEN total_dispensation ELSE 0 END) AS male_total,
    SUM(CASE WHEN sex = 'F' THEN total_dispensation ELSE 0 END)
      - SUM(CASE WHEN sex = 'M' THEN total_dispensation ELSE 0 END) AS difference -- F minus M
FROM Dispensation.csv