# Analyzing Data Using Python
Reading and analyzing a large dataset by hand is impractical. This Python notebook shows how to use pandas to read, filter, concatenate, and export data.
### Step 1 - Importing Pandas
Import pandas and create a data frame object with the `read_csv` function. The function takes the file name, including its extension. The CSV file used in this example is a spreadsheet of county health data, placed in the same directory as this Python notebook.

In [1]:
import pandas as pd

df = pd.read_csv("CountyHealthData_2014-2015.csv")
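A quick way to confirm that a file loaded as expected is to look at its shape, column types, and first rows. The sketch below is only an illustration: it reads a hypothetical three-row sample from memory (the name `sample` and its values are assumptions made for the example) rather than the real county health file.

```python
import io

import pandas as pd

# Hypothetical three-row sample standing in for the real county health file.
sample_csv = io.StringIO(
    "State,FIPS,Unemployment\n"
    "AK,2016,0.091\n"
    "AK,2020,0.054\n"
    "WY,56045,0.051\n"
)

# read_csv accepts a file path or any file-like object.
sample = pd.read_csv(sample_csv)

print(sample.shape)    # (number of rows, number of columns)
print(sample.dtypes)   # type pandas inferred for each column
print(sample.head(2))  # first two rows
```

The same three calls should work on `df` once the real file has been read.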

### Step 2 - Calling a "Series"
A series is like a column in a table. To narrow the data to one or more columns, pass a list of column names inside the indexing brackets, which produces the nested square brackets. The inner brackets hold the list of column names, and the result is a smaller data frame.

The following shows the data narrowed to two columns: unemployment and income inequality.

In [10]:
df[["Unemployment", "Income inequality"]]

Unnamed: 0,Unemployment,Income inequality
0,0.091,
1,0.088,3.907
2,0.054,
3,0.050,3.806
4,0.152,
...,...,...
6104,0.047,4.630
6105,0.054,
6106,0.050,4.157
6107,0.051,


Two more examples:

In [2]:
df[["FIPS", "Air pollution - particulate matter"]]

Unnamed: 0,FIPS,Air pollution - particulate matter
0,2016,
1,2016,
2,2020,
3,2020,
4,2050,
...,...,...
6104,56041,11.61
6105,56043,10.04
6106,56043,10.04
6107,56045,10.71


In [20]:
df[["State", "Population that is not proficient in English"]]

Unnamed: 0,State,Population that is not proficient in English
0,AK,0.078
1,AK,0.080
2,AK,0.023
3,AK,0.023
4,AK,0.046
...,...,...
6104,WY,0.009
6105,WY,0.015
6106,WY,0.015
6107,WY,0.002
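The difference between single and double brackets in this step can be sketched on a small frame. The frame `sample` and its values are hypothetical and exist only for this illustration.

```python
import pandas as pd

# Hypothetical two-row frame used only for this illustration.
sample = pd.DataFrame(
    {
        "State": ["AK", "WY"],
        "Unemployment": [0.091, 0.051],
        "FIPS": [2016, 56045],
    }
)

one_column = sample["Unemployment"]              # single brackets: a Series
as_frame = sample[["Unemployment"]]              # list with one name: a DataFrame
two_columns = sample[["State", "Unemployment"]]  # list with two names: a DataFrame

print(type(one_column).__name__)    # Series
print(type(as_frame).__name__)      # DataFrame
print(two_columns.columns.tolist()) # ['State', 'Unemployment']
```

Because the double-bracket form always returns a data frame, its results can be passed directly to functions that expect one.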


### Step 3 - Concatenate Subsets
Concatenate these three subsets into one data frame using `pd.concat`. This function takes a list of the three data frames as its first argument. Passing `axis=1` places them side by side as columns instead of stacking them as rows.

In [3]:
# Selecting with a list of names already returns a DataFrame,
# so no pd.DataFrame() wrapper is needed.
series0 = df[["Unemployment", "Income inequality"]]
series1 = df[["FIPS", "Air pollution - particulate matter"]]
series2 = df[["State", "Population that is not proficient in English"]]

# axis=1 joins the subsets side by side, matching rows by index.
final = pd.concat([series0, series1, series2], axis=1, ignore_index=False, sort=False)
final

Unnamed: 0,Unemployment,Income inequality,FIPS,Air pollution - particulate matter,State,Population that is not proficient in English
0,0.091,,2016,,AK,0.078
1,0.088,3.907,2016,,AK,0.080
2,0.054,,2020,,AK,0.023
3,0.050,3.806,2020,,AK,0.023
4,0.152,,2050,,AK,0.046
...,...,...,...,...,...,...
6104,0.047,4.630,56041,11.61,WY,0.009
6105,0.054,,56043,10.04,WY,0.015
6106,0.050,4.157,56043,10.04,WY,0.015
6107,0.051,,56045,10.71,WY,0.002
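With `axis=1`, `pd.concat` matches rows by index label rather than by position. The sketch below, which uses a hypothetical three-row frame named `sample`, illustrates two consequences: subsets taken from the same frame recombine cleanly, and a subset with missing rows leaves gaps (NaN) in the result.

```python
import pandas as pd

# Hypothetical three-row frame used only for this illustration.
sample = pd.DataFrame(
    {
        "Unemployment": [0.091, 0.054, 0.051],
        "State": ["AK", "AK", "WY"],
        "FIPS": [2016, 2020, 56045],
    }
)

left = sample[["Unemployment"]]
right = sample[["State", "FIPS"]]

# Subsets of the same frame share an index, so the rows line up.
combined = pd.concat([left, right], axis=1)
direct = sample[["Unemployment", "State", "FIPS"]]
pd.testing.assert_frame_equal(combined, direct)  # raises if they differ

# A subset that is missing rows is padded with NaN where no index label matches.
only_wy = right[right["State"] == "WY"]
padded = pd.concat([left, only_wy], axis=1)
print(padded)
```

When every column comes from the same frame, as in this step, selecting all the names in a single list gives the same result as concatenating the subsets.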


### Step 4 - Exporting a New Subset
Use the `to_csv()` method on the data frame to export the new subset to a CSV file. Pass the optional argument `index=False` to leave the row index out of the file.

In [4]:
final.to_csv("CompiledData.csv", index=False)
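The effect of `index=False` is easiest to see by reading the exported text back in. The sketch below writes a hypothetical two-row frame (`small`) to strings instead of files, an assumption made so the example does not touch the disk.

```python
import io

import pandas as pd

# Hypothetical two-row frame used only for this illustration.
small = pd.DataFrame({"State": ["AK", "WY"], "Unemployment": [0.091, 0.051]})

# With no path argument, to_csv returns the CSV text as a string.
with_index = small.to_csv()
without_index = small.to_csv(index=False)

# Reading the text back shows the extra unnamed column that the index creates.
print(pd.read_csv(io.StringIO(with_index)).columns.tolist())
# ['Unnamed: 0', 'State', 'Unemployment']
print(pd.read_csv(io.StringIO(without_index)).columns.tolist())
# ['State', 'Unemployment']
```

Leaving the index out keeps the exported file limited to the data columns, which is usually what is wanted when the index is just a row counter.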