#### Write this as a brief summary of your interests and intent, including:

* The kind of data you'd like to work with/field you're interested in (e.g., geodata, weather data, etc.)

* The kinds of questions you'll be asking of that data

* Possible sources for such data

In other words, write down what kind of data you plan to work with, and what kinds of questions you'd like to ask of it. This constitutes your Project Proposal/Outline, and should look something like this:

> Our project is to uncover patterns in criminal activity around Los Angeles. We'll examine relationships between types of crime and location; crime rates and times of day; trends in crime rates over the course of the year; and related questions, as the data admits.

#### Finding Data

Once your group has written an outline, it's time to start hunting for data. You are free to use data from any source, but we recommend the following curated sources of high-quality data:

* [data.world](https://data.world/)

* [Kaggle](https://www.kaggle.com/)

* [Data.gov](https://www.data.gov)

* [Public APIs](https://github.com/abhishekbanthia/Public-APIs)

* [Awesome-APIs List](https://github.com/Kikobeats/awesome-api)

* [Medium APIs List](https://medium.com/@benjamin_libor/a-curated-collection-of-over-150-apis-to-build-great-products-fdcfa0f361bc)

Chances are you'll have to update your Project Outline as you explore the available data. **This is fine**: adjustments like this are part of the process! Just make sure everyone in the group is up to speed on the goals of the project as you make changes.

Make sure that your data is not too large for local analysis. **Big Data** datasets are difficult to manage locally, so consider a subset of that data or a different dataset altogether.
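
If a dataset turns out to be too large to load comfortably, pandas can read just a preview or stream the file in pieces. The sketch below is illustrative only: the in-memory CSV, the column names, and the `Year >= 2010` filter are made-up stand-ins for your own file and criteria.

```python
import io

import pandas as pd

# Stand-in for a large file on disk; in practice you would pass a file path.
csv_text = "State,Year,Deaths\n" + "\n".join(
    f"State{i % 5},{2000 + i % 16},{i}" for i in range(1000)
)

# Option 1: read only the first rows to inspect the structure cheaply.
preview = pd.read_csv(io.StringIO(csv_text), nrows=100)

# Option 2: stream the file in chunks and keep only the rows you need.
kept = []
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=250):
    kept.append(chunk[chunk["Year"] >= 2010])
subset = pd.concat(kept, ignore_index=True)

print(len(preview), len(subset))
```

Filtering chunk by chunk keeps memory use bounded by the chunk size rather than the file size.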

#### Data Cleanup & Analysis

With data in hand, it's time to tackle development and analysis. This is where the fun starts!

The analysis process can be broken into two broad phases: **Exploration & Cleanup** and **Analysis** proper.

As you've learned, you'll need to explore, clean, and reformat your data before you can begin to answer your research questions. We recommend keeping track of these exploration and cleanup steps in a dedicated Jupyter Notebook, both for organization's sake and to make it easier to present your work later.

Similarly, after you've massaged your data and are ready to start crunching numbers, you should keep track of your work in a Jupyter Notebook dedicated specifically to analysis.

During both phases, **don't forget to include plots**! Don't make the mistake of waiting to build figures until you're preparing your presentation. Creating them along the way can reveal insights and interesting trends in the data that you might not notice otherwise.

We recommend focusing your analysis on techniques such as aggregation, correlation, comparison, summary statistics, sentiment analysis, and time series analysis.
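
As a rough illustration of three of these techniques (summary statistics, aggregation, and correlation), here is a minimal pandas sketch. The data frame is a made-up toy example; its column names and values are assumptions, not part of any recommended dataset.

```python
import pandas as pd

# Toy data for illustration only.
df = pd.DataFrame(
    {
        "State": ["A", "A", "A", "B", "B", "B"],
        "Year": [2013, 2014, 2015, 2013, 2014, 2015],
        "Deaths": [10, 20, 30, 5, 5, 20],
        "Death_Rate": [1.0, 2.0, 3.0, 1.0, 1.0, 4.0],
    }
)

# Summary statistics: count, mean, std, min, quartiles, max.
summary = df["Deaths"].describe()

# Aggregation: total and average deaths per state.
by_state = df.groupby("State")["Deaths"].agg(["sum", "mean"])

# Correlation: Pearson correlation between two numeric columns.
corr = df["Deaths"].corr(df["Death_Rate"])

print(summary, by_state, corr, sep="\n\n")
```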

Finally, be sure that your project meets the [technical requirements](TechnicalRequirements.md).


### Import our dependencies

In [16]:
import pandas as pd 
import matplotlib.pyplot as plt
import numpy as np
csv_pathSO = "overdose_processed_states.csv"
StateOverdose_df = pd.read_csv(csv_pathSO)

StateOverdose_df.head(10)

Unnamed: 0,State,Year,Deaths,Death_Rate,Pct_of_Total_Deaths,Multiple_Cause_of_death,log_of_Deaths,log_of_Pct_of_Deaths,log_of_Death_Rate
0,Wyoming,2000,,,,Heroin,,,
1,Wyoming,2000,,,,Other opioids,,,
2,Wyoming,2000,,,,Methadone,,,
3,Wyoming,2000,,,,Other synthetic narcotics,,,
4,Wyoming,2000,0.0,0.0,0.0,All Opioids,0.0,0.0,0.0
5,Wyoming,2001,,,,Heroin,,,
6,Wyoming,2001,,,,Other opioids,,,
7,Wyoming,2001,,,,Methadone,,,
8,Wyoming,2001,,,,Other synthetic narcotics,,,
9,Wyoming,2001,0.0,0.0,0.0,All Opioids,0.0,0.0,0.0


### Import data

Import all datasets needed to test our hypothesis.

In [17]:
StateOverdose_df = StateOverdose_df.dropna(how="any")
StateOverdose_df.head(10)

Unnamed: 0,State,Year,Deaths,Death_Rate,Pct_of_Total_Deaths,Multiple_Cause_of_death,log_of_Deaths,log_of_Pct_of_Deaths,log_of_Death_Rate
4,Wyoming,2000,0.0,0.0,0.0,All Opioids,0.0,0.0,0.0
9,Wyoming,2001,0.0,0.0,0.0,All Opioids,0.0,0.0,0.0
14,Wyoming,2002,10.0,0.0,0.0,All Opioids,2.302585,0.0,0.0
19,Wyoming,2003,0.0,0.0,0.0,All Opioids,0.0,0.0,0.0
24,Wyoming,2004,0.0,0.0,0.0,All Opioids,0.0,0.0,0.0
29,Wyoming,2005,0.0,0.0,0.0,All Opioids,0.0,0.0,0.0
34,Wyoming,2006,0.0,0.0,0.0,All Opioids,0.0,0.0,0.0
39,Wyoming,2007,0.0,0.0,0.0,All Opioids,0.0,0.0,0.0
41,Wyoming,2008,24.0,5.8,0.0,Other opioids,3.178054,0.0,1.757858
44,Wyoming,2008,35.0,5.8,0.0,All Opioids,3.555348,0.0,1.757858
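
Before dropping rows, it can help to count how many values are missing per column, since `dropna(how="any")` discards a row if any single column is `NaN`. The sketch below rebuilds rows 0 to 4 of the raw table (Wyoming, 2000), trimmed to a few columns. It also shows that surviving rows keep their original index labels, which is why the output above starts at 4.

```python
import numpy as np
import pandas as pd

# Rows 0-4 of the raw table shown earlier (Wyoming, 2000), trimmed to a few columns.
raw = pd.DataFrame(
    {
        "State": ["Wyoming"] * 5,
        "Year": [2000] * 5,
        "Deaths": [np.nan, np.nan, np.nan, np.nan, 0.0],
        "Multiple_Cause_of_death": [
            "Heroin",
            "Other opioids",
            "Methadone",
            "Other synthetic narcotics",
            "All Opioids",
        ],
    }
)

# How many NaNs each column holds.
missing_per_column = raw.isna().sum()

# Drop any row containing a NaN; the surviving row keeps index label 4.
cleaned = raw.dropna(how="any")

# Optionally renumber the index from 0.
cleaned_reset = cleaned.reset_index(drop=True)

print(missing_per_column)
print(cleaned_reset)
```

Note that in this sample every drug-specific row is dropped, so only the "All Opioids" total survives.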


In [18]:
filter_data = StateOverdose_df[StateOverdose_df["Multiple_Cause_of_death"] == "All Opioids"]
filter_data

Unnamed: 0,State,Year,Deaths,Death_Rate,Pct_of_Total_Deaths,Multiple_Cause_of_death,log_of_Deaths,log_of_Pct_of_Deaths,log_of_Death_Rate
4,Wyoming,2000,0.0,0.0,0.0,All Opioids,0.0,0.0,0.0
9,Wyoming,2001,0.0,0.0,0.0,All Opioids,0.0,0.0,0.0
14,Wyoming,2002,10.0,0.0,0.0,All Opioids,2.302585,0.0,0.0
19,Wyoming,2003,0.0,0.0,0.0,All Opioids,0.0,0.0,0.0
24,Wyoming,2004,0.0,0.0,0.0,All Opioids,0.0,0.0,0.0
29,Wyoming,2005,0.0,0.0,0.0,All Opioids,0.0,0.0,0.0
34,Wyoming,2006,0.0,0.0,0.0,All Opioids,0.0,0.0,0.0
39,Wyoming,2007,0.0,0.0,0.0,All Opioids,0.0,0.0,0.0
44,Wyoming,2008,35.0,5.8,0.0,All Opioids,3.555348,0.0,1.757858
49,Wyoming,2009,27.0,0.0,0.0,All Opioids,3.295837,0.0,0.0


### Clean Data

* Merge tables with similar data
* Rename columns
* Delete duplicates


In [19]:
Grouped = filter_data.groupby(["State", "Year"])

pd.set_option('display.max_rows', None)
Grouped.first()

State,Year,Deaths,Death_Rate,Pct_of_Total_Deaths,Multiple_Cause_of_death,log_of_Deaths,log_of_Pct_of_Deaths,log_of_Death_Rate
Alabama,2000,51.0,0.8,0.0,All Opioids,3.931826,0.0,-0.223144
Alabama,2001,55.0,1.6,0.0,All Opioids,4.007333,0.0,0.470004
Alabama,2002,66.0,2.0,0.0,All Opioids,4.189655,0.0,0.693147
Alabama,2003,50.0,1.5,0.0,All Opioids,3.912023,0.0,0.405465
Alabama,2004,100.0,2.6,0.0,All Opioids,4.60517,0.0,0.955511
Alabama,2005,85.0,2.1,0.0,All Opioids,4.442651,0.0,0.741937
Alabama,2006,143.0,4.1,0.0,All Opioids,4.962845,0.0,1.410987
Alabama,2007,174.0,5.0,0.0,All Opioids,5.159055,0.0,1.609438
Alabama,2008,202.0,5.2,0.0,All Opioids,5.308268,0.0,1.648659
Alabama,2009,225.0,5.9,0.0,All Opioids,5.4161,0.0,1.774952


In [20]:
pivottable = pd.pivot_table(filter_data, values='Deaths', index=['State'], columns=['Year'], aggfunc="sum")
pivottable

State,2000,2001,2002,2003,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013,2014,2015
Alabama,51.0,55.0,66.0,50.0,100.0,85.0,143.0,174.0,202.0,225.0,197.0,192.0,178.0,187.0,315.0,315.0
Alaska,21.0,10.0,0.0,10.0,0.0,0.0,25.0,0.0,96.0,105.0,83.0,68.0,74.0,74.0,94.0,115.0
Arizona,144.0,156.0,244.0,294.0,285.0,333.0,391.0,408.0,469.0,560.0,604.0,550.0,544.0,523.0,628.0,716.0
Arkansas,0.0,14.0,100.0,121.0,151.0,144.0,170.0,179.0,256.0,273.0,234.0,206.0,191.0,195.0,182.0,229.0
California,1138.0,649.0,1602.0,1537.0,1487.0,1446.0,1595.0,1730.0,1979.0,2228.0,2141.0,2179.0,1924.0,2169.0,2250.0,2233.0
Colorado,141.0,161.0,161.0,152.0,168.0,242.0,256.0,337.0,315.0,372.0,301.0,415.0,414.0,451.0,555.0,531.0
Connecticut,129.0,133.0,137.0,150.0,176.0,158.0,204.0,232.0,210.0,195.0,185.0,177.0,174.0,476.0,602.0,861.0
Delaware,0.0,10.0,38.0,30.0,15.0,22.0,46.0,46.0,50.0,91.0,114.0,109.0,86.0,118.0,145.0,156.0
District of Columbia,0.0,0.0,0.0,10.0,19.0,10.0,30.0,0.0,0.0,0.0,26.0,49.0,39.0,45.0,81.0,116.0
Florida,553.0,907.0,990.0,1110.0,1318.0,1211.0,1394.0,1576.0,1579.0,1790.0,1849.0,1762.0,1504.0,1448.0,1657.0,2209.0
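
With one row per state and one column per year, each row of the pivot table is a ready-made time series. As a minimal sketch, the Alabama row from the output above can be plotted as follows. The values are copied by hand here so the example runs on its own; in the notebook you would select the row with something like `pivottable.loc["Alabama"]`.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Alabama "All Opioids" deaths, copied from the pivot table output above.
years = list(range(2000, 2016))
deaths = [51, 55, 66, 50, 100, 85, 143, 174, 202, 225, 197, 192, 178, 187, 315, 315]
alabama = pd.Series(deaths, index=years, name="Deaths")

# Draw the series as a line chart with one marker per year.
fig, ax = plt.subplots()
alabama.plot(ax=ax, marker="o")
ax.set_xlabel("Year")
ax.set_ylabel("Deaths")
ax.set_title("Opioid deaths in Alabama")
```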


In [23]:
csv_pathML = "states_and_dates.csv"
MarijuanaLegal_df = pd.read_csv(csv_pathML) 

MarijuanaLegal_df

Unnamed: 0,State,current status,medical Year,Recreation Year,2012,2013,2014,2015,2016,2017,2018,2019
0,Alabama,Illegal,,,False,False,False,False,False,False,False,False
1,Alaska,Recreational,1998.0,2014.0,False,False,True,True,True,True,True,True
2,Arizona,Medical,2010.0,,False,False,False,False,False,False,False,False
3,Arakansas,Medical,2016.0,,False,False,False,False,False,False,False,False
4,California,Recreational,1996.0,2016.0,False,False,False,False,True,True,True,True
5,Colorado,Recreational,2000.0,2012.0,True,True,True,True,True,True,True,True
6,Connecticut,Medical,2012.0,,False,False,False,False,False,False,False,False
7,Delaware,Medical,2011.0,,False,False,False,False,False,False,False,False
8,Florida,Medical,2016.0,,False,False,False,False,False,False,False,False
9,Georgia,Illegal,,,False,False,False,False,False,False,False,False
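
The cleanup steps listed under "Clean Data" (rename columns, delete duplicates, merge tables) might look roughly like the sketch below. It uses a few rows modeled on the two tables in this notebook; the duplicate row and the new snake_case column names are assumptions added for illustration. Note that the legalization table spells one state "Arakansas", which would silently fail to match "Arkansas" in the overdose data unless corrected before merging.

```python
import pandas as pd

# A few rows modeled on the two tables in this notebook.
overdose = pd.DataFrame(
    {
        "State": ["Alaska", "Arkansas", "Arkansas"],
        "Year": [2015, 2015, 2015],
        "Deaths": [115.0, 229.0, 229.0],  # last row is a deliberate duplicate
    }
)
legal = pd.DataFrame(
    {
        "State": ["Alaska", "Arakansas"],  # spelling as it appears in the legalization table
        "current status": ["Recreational", "Medical"],
        "medical Year": [1998.0, 2016.0],
    }
)

# Rename columns to a consistent style (the new names are our own choice).
legal = legal.rename(
    columns={"current status": "current_status", "medical Year": "medical_year"}
)

# Fix the misspelled key so the join can match it.
legal["State"] = legal["State"].replace({"Arakansas": "Arkansas"})

# Delete exact duplicate rows.
overdose = overdose.drop_duplicates()

# Merge; indicator=True adds a "_merge" column flagging unmatched rows.
merged = overdose.merge(legal, on="State", how="left", indicator=True)
print(merged)
```

Any row whose `_merge` value is `left_only` points to a state name that did not match, which is a quick way to catch further spelling differences.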


### Analysis Question One:
Which drugs are considered opioids, and which have the highest mortality rate? Create a DataFrame that lists all drugs that are recognized as opioids in the US. Create a bar chart to visualize which opioids have the highest mortality rates.
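
One possible starting point for this question is to total deaths per category in the `Multiple_Cause_of_death` column and plot the result. The counts below are hypothetical and for illustration only. Note also that this needs the drug-specific rows, so it would have to start from the data before it is filtered down to "All Opioids".

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical counts for illustration only; not taken from the dataset.
rows = pd.DataFrame(
    {
        "Multiple_Cause_of_death": [
            "Heroin", "Other opioids", "Methadone",
            "Heroin", "Other opioids", "All Opioids",
        ],
        "Deaths": [10.0, 24.0, 5.0, 12.0, 30.0, 81.0],
    }
)

# Exclude the "All Opioids" total so it does not dwarf the individual categories.
by_drug = (
    rows[rows["Multiple_Cause_of_death"] != "All Opioids"]
    .groupby("Multiple_Cause_of_death")["Deaths"]
    .sum()
    .sort_values(ascending=False)
)

# One bar per opioid category, tallest first.
fig, ax = plt.subplots()
by_drug.plot.bar(ax=ax)
ax.set_ylabel("Deaths")
```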

### Analysis Question Two:
Which states have the highest rates of opioid abuse? Create a DataFrame showing which states have the highest deaths per capita due to drug abuse. Graph this data. Create a heat map of the US to illustrate which states have the highest usage.
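
A simple way to rank states is `nlargest`. The sketch below uses the 2015 "All Opioids" death counts for five states, copied from the pivot table earlier in this notebook. Raw counts favor populous states, so for a per-capita ranking the same call could be applied to the `Death_Rate` column instead, assuming that column holds a population-adjusted rate (the source does not say so explicitly).

```python
import pandas as pd

# 2015 "All Opioids" death counts for five states, copied from the pivot table.
deaths_2015 = pd.Series(
    {
        "Alabama": 315.0,
        "Alaska": 115.0,
        "Arizona": 716.0,
        "Arkansas": 229.0,
        "California": 2233.0,
    },
    name="Deaths",
)

# The three states with the most deaths, in descending order.
top3 = deaths_2015.nlargest(3)
print(top3)
```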

### Analysis Question Three:
Is there a difference in opioid overdose deaths between states with medical marijuana legalization and states with recreational marijuana legalization?
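
To compare legalization types, one option is to label each state-year with the status in force at the time, using the `medical Year` and `Recreation Year` columns from the legalization table. This is a sketch under assumptions: the three states and their years are copied from that table, while the 2011 to 2014 window and the `status_in_year` column name are our own choices.

```python
import numpy as np
import pandas as pd

# Three states copied from the legalization table.
legal = pd.DataFrame(
    {
        "State": ["Alabama", "Alaska", "Colorado"],
        "medical Year": [np.nan, 1998.0, 2000.0],
        "Recreation Year": [np.nan, 2014.0, 2012.0],
    }
)

# One row per (State, Year) for an illustrative window of years.
years = pd.DataFrame({"Year": [2011, 2012, 2013, 2014]})
status = legal.merge(years, how="cross")

# Comparisons against NaN are False, so states without a year fall through.
status["status_in_year"] = np.select(
    [
        status["Year"] >= status["Recreation Year"],
        status["Year"] >= status["medical Year"],
    ],
    ["Recreational", "Medical"],
    default="Illegal",
)
print(status[["State", "Year", "status_in_year"]])
```

The labeled table can then be merged with the overdose data on `State` and `Year`, and death rates compared across the three status groups.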
