# Example 1 - Data Manipulation

## Introduction

Performing an in-depth analysis is challenging, so over the years our team has tried different teaching styles to smooth the learning curve. This series of notebooks will guide your team through the fundamentals of data science and show you a prime example of a winning analysis. We highly encourage everyone to read through the examples and complete the workshops.

This year, we are going with a Wikipedia-style tutorial series in which each week's tutorial is split into two notebooks: one for the example analysis and another for the workshop. Throughout the example, hyperlinks connect the two notebooks and let you jump from one to the other. These hyperlinks work just like the ones on Wikipedia: they help round out your knowledge if you come across an unfamiliar concept in the examples. We found that students learn best when new concepts explained in the workshops are supported by an actual example in an analysis.

Nonetheless, go through the material in whatever order you like! Don't be afraid to complete the workshops first and then view the examples. Remember to ask for help on Slack if we didn't do a good enough job of explaining!

Best wishes,<br>
The STEM Fellowship Data Science Team

## Step 1: Looking At Data

The first step in any analysis is to look for a good dataset. You generally want a dataset that isn't missing values, contains many descriptive variables, and pertains to a topic that could be a candidate for an analysis. Our team viewed dozens of datasets and struggled to select the best one for this analysis, so don't get discouraged! We didn't know what our analysis should be, so we brainstormed topics, gathered more datasets, and rejected ideas. This process takes a while, but once your team has a direction you'll do fine.

Our team decided to look at the Air Quality dataset provided by the EPA (United States Environmental Protection Agency). When you find data, make sure to look for a README that gives context on the column headers.

 - The Dataset: https://aqs.epa.gov/aqsweb/airdata/download_files.html#main-content
 - README: https://aqs.epa.gov/aqsweb/airdata/FileFormats.html

Since the dataset contains 11 million rows, we've provided a minimized version for those with less powerful computers. As a benchmark, it can run on a 7-year-old computer with 8 GB of RAM.
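If a file still strains your machine, pandas can read a CSV in pieces. The following is a minimal sketch that uses an in-memory stand-in for a large file; the column names and values are invented for illustration and are not taken from the EPA files.

```python
import io

import pandas as pd

# Hypothetical stand-in for a large CSV file (the values are made up).
csv_text = "State,Good Days\nAlabama,58\nAlabama,27\nAlaska,90\nArizona,94\n"

# nrows reads only the first few rows, which is handy for a quick peek.
preview = pd.read_csv(io.StringIO(csv_text), nrows=2)

# chunksize returns an iterator of DataFrames instead of one big DataFrame.
total_rows = 0
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=3):
    total_rows += len(chunk)

print(preview)
print(total_rows)
```

With a real file you would pass its path instead of the `io.StringIO` object.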

In [1]:
import pandas as pd

In [2]:
data1 = pd.read_csv("../datasets/annual_aqi_by_cbsa_2018.csv")
data2 = pd.read_csv("../datasets/annual_aqi_by_county_2018.csv")
data3 = pd.read_csv("../datasets/annual_conc_by_monitor_2018.csv")

In [3]:
data1.head()

Unnamed: 0,CBSA,CBSA Code,Year,Days with AQI,Good Days,Moderate Days,Unhealthy for Sensitive Groups Days,Unhealthy Days,Very Unhealthy Days,Hazardous Days,Max AQI,90th Percentile AQI,Median AQI,Days CO,Days NO2,Days Ozone,Days SO2,Days PM2.5,Days PM10
0,"Aberdeen, WA",10140,2018,31,30,1,0,0,0,0,57,38,26,0,0,0,0,31,0
1,"Akron, OH",10420,2018,90,71,19,0,0,0,0,79,60,37,0,0,27,0,63,0
2,"Albany, GA",10500,2018,59,44,15,0,0,0,0,79,64,29,0,0,0,0,59,0
3,"Albany-Schenectady-Troy, NY",10580,2018,90,69,21,0,0,0,0,78,60,40,0,0,52,0,38,0
4,"Alexandria, LA",10780,2018,28,26,2,0,0,0,0,54,49,27,0,0,0,0,28,0


In [4]:
data2.head()

Unnamed: 0,State,County,Year,Days with AQI,Good Days,Moderate Days,Unhealthy for Sensitive Groups Days,Unhealthy Days,Very Unhealthy Days,Hazardous Days,Max AQI,90th Percentile AQI,Median AQI,Days CO,Days NO2,Days Ozone,Days SO2,Days PM2.5,Days PM10
0,Alabama,DeKalb,2018,58,58,0,0,0,0,0,44,37,31,0,0,58,0,0,0
1,Alabama,Etowah,2018,59,27,31,0,1,0,0,153,62,52,0,0,0,0,59,0
2,Alabama,Jefferson,2018,59,38,21,0,0,0,0,72,57,47,0,1,6,7,44,1
3,Alabama,Mobile,2018,59,55,4,0,0,0,0,53,38,17,0,0,0,20,39,0
4,Alabama,Morgan,2018,59,54,5,0,0,0,0,56,50,29,0,0,0,0,59,0


In [5]:
data3.head()

Unnamed: 0,State Code,County Code,Site Num,Parameter Code,POC,Latitude,Longitude,Datum,Parameter Name,Sample Duration,...,75th Percentile,50th Percentile,10th Percentile,Local Site Name,Address,State Name,County Name,City Name,CBSA Name,Date of Last Change
0,1,55,10,88502,3,33.991494,-85.992647,NAD83,Acceptable PM2.5 AQI & Speciation Mass,1 HOUR,...,16.1,12.0,6.3,GADSDEN C. COLLEGE,"1001 WALLACE DRIVE, GADSDEN, AL 35902",Alabama,Etowah,Gadsden,"Gadsden, AL",2018-05-09
1,1,55,10,88502,3,33.991494,-85.992647,NAD83,Acceptable PM2.5 AQI & Speciation Mass,24-HR BLK AVG,...,15.5,12.7,7.3,GADSDEN C. COLLEGE,"1001 WALLACE DRIVE, GADSDEN, AL 35902",Alabama,Etowah,Gadsden,"Gadsden, AL",2018-05-09
2,1,73,23,42401,2,33.553056,-86.815,WGS84,Sulfur dioxide,1 HOUR,...,2.8,1.7,0.2,North Birmingham,"NO. B'HAM,SOU R.R., 3009 28TH ST. NO.",Alabama,Jefferson,Birmingham,"Birmingham-Hoover, AL",2018-04-12
3,1,73,23,42401,2,33.553056,-86.815,WGS84,Sulfur dioxide,1 HOUR,...,0.7,0.3,0.0,North Birmingham,"NO. B'HAM,SOU R.R., 3009 28TH ST. NO.",Alabama,Jefferson,Birmingham,"Birmingham-Hoover, AL",2018-04-12
4,1,73,23,42401,2,33.553056,-86.815,WGS84,Sulfur dioxide,24-HR BLK AVG,...,0.9,0.5,0.1,North Birmingham,"NO. B'HAM,SOU R.R., 3009 28TH ST. NO.",Alabama,Jefferson,Birmingham,"Birmingham-Hoover, AL",2018-04-12


In [6]:
print(data1.columns)
print(data2.columns)
print(data3.columns)
# I googled around and found out what a CBSA is: https://en.wikipedia.org/wiki/Core-based_statistical_area

Index(['CBSA', 'CBSA Code', 'Year', 'Days with AQI', 'Good Days',
       'Moderate Days', 'Unhealthy for Sensitive Groups Days',
       'Unhealthy Days', 'Very Unhealthy Days', 'Hazardous Days', 'Max AQI',
       '90th Percentile AQI', 'Median AQI', 'Days CO', 'Days NO2',
       'Days Ozone', 'Days SO2', 'Days PM2.5', 'Days PM10'],
      dtype='object')
Index(['State', 'County', 'Year', 'Days with AQI', 'Good Days',
       'Moderate Days', 'Unhealthy for Sensitive Groups Days',
       'Unhealthy Days', 'Very Unhealthy Days', 'Hazardous Days', 'Max AQI',
       '90th Percentile AQI', 'Median AQI', 'Days CO', 'Days NO2',
       'Days Ozone', 'Days SO2', 'Days PM2.5', 'Days PM10'],
      dtype='object')
Index(['State Code', 'County Code', 'Site Num', 'Parameter Code', 'POC',
       'Latitude', 'Longitude', 'Datum', 'Parameter Name', 'Sample Duration',
       'Pollutant Standard', 'Metric Used', 'Method Name', 'Year',
       'Units of Measure', 'Event Type', 'Observation Count',
       'Ob

In [7]:
# I only want the Latitude, Longitude and Datum columns
data3_temp = data3[["Latitude","Longitude","Datum"]]
data3_temp.head(2)

Unnamed: 0,Latitude,Longitude,Datum
0,33.991494,-85.992647,NAD83
1,33.991494,-85.992647,NAD83


In [8]:
# I want just one piece of data
data3_temp["Latitude"][3]

33.553055999999998
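Chaining two brackets works for reading a value, but the same lookup can also be written in one step. Here is a small sketch on a made-up DataFrame (the `coords` name and its values are illustrative only):

```python
import pandas as pd

# Hypothetical miniature of a latitude/longitude table.
coords = pd.DataFrame({
    "Latitude": [33.99, 33.99, 33.55, 33.55],
    "Longitude": [-85.99, -85.99, -86.82, -86.82],
})

chained = coords["Latitude"][3]          # pick the column, then the row label
single_step = coords.loc[3, "Latitude"]  # row label and column label together
scalar_only = coords.at[3, "Latitude"]   # .at is meant for a single scalar

print(chained, single_step, scalar_only)
```

All three return the same number; the single-step forms tend to be the safer habit once you start assigning values.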

### Attribute selection
There are two paradigms for attribute selection:
iloc for index-based (positional) selection and loc for label-based selection.
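The difference between position and label is easiest to see when the index labels are not 0, 1, 2, and so on. The sketch below uses a hypothetical three-row DataFrame with made-up labels:

```python
import pandas as pd

# Hypothetical frame whose index labels do not match the row positions.
df = pd.DataFrame(
    {"County": ["DeKalb", "Etowah", "Jefferson"]},
    index=[10, 20, 30],
)

by_position = df.iloc[0, 0]       # first row, first column, whatever the labels are
by_label = df.loc[10, "County"]   # row labelled 10, column labelled "County"

# There is no row *labelled* 0, so loc raises a KeyError here.
try:
    df.loc[0]
    label_zero_exists = True
except KeyError:
    label_zero_exists = False

print(by_position, by_label, label_zero_exists)
```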

In [9]:
# Index based selection
data3.iloc[1]

State Code                                                            1
County Code                                                          55
Site Num                                                             10
Parameter Code                                                    88502
POC                                                                   3
Latitude                                                        33.9915
Longitude                                                      -85.9926
Datum                                                             NAD83
Parameter Name                   Acceptable PM2.5 AQI & Speciation Mass
Sample Duration                                           24-HR BLK AVG
Pollutant Standard                                                  NaN
Metric Used                                             Observed Values
Method Name                                                         NaN
Year                                                            

Both loc and iloc are row-first, column-second.
This means that it's marginally easier to retrieve rows
and marginally harder to retrieve columns.
To get a column with iloc, we can do the following:

In [10]:
data3_temp.iloc[:, 0].head(2)
# the : operator comes from native Python; on its own it means "everything"

0    33.991494
1    33.991494
Name: Latitude, dtype: float64

We can also use this operator to select a range of data.
For example, to select the first, second and third entries:

In [11]:
data3_temp.iloc[:3, 0]

0    33.991494
1    33.991494
2    33.553056
Name: Latitude, dtype: float64

You can also pass in a list:

In [12]:
data3_temp.iloc[[1,2,3],0]

1    33.991494
2    33.553056
3    33.553056
Name: Latitude, dtype: float64

You can also make a selection with negative numbers,
which count from the end.
For example, to get the last 5 rows:

In [13]:
data3_temp.iloc[-5:]

Unnamed: 0,Latitude,Longitude,Datum
14812,32.633671,-115.504995,WGS84
14813,32.466389,-114.768611,WGS84
14814,32.466389,-114.768611,WGS84
14815,32.466389,-114.768611,WGS84
14816,32.466389,-114.768611,WGS84
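A negative slice and `tail` give the same rows. Here is a minimal sketch on a hypothetical five-row DataFrame with made-up latitudes:

```python
import pandas as pd

# Hypothetical data; only the row count matters here.
df = pd.DataFrame({"Latitude": [33.9, 33.5, 32.6, 32.4, 32.4]})

last_three = df.iloc[-3:]   # start three rows from the end, go to the end
same_rows = df.tail(3)      # convenience method for the same selection

print(last_three)
```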


### Label based selection (loc)
It is similar to index-based selection, but it uses labels instead of positions.
For example, to get the first entry in the Year column:

In [17]:
data3.loc[0, 'Year']

2018
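One difference worth knowing: a slice in iloc excludes its end, as in plain Python, while a slice in loc includes the end label. The sketch below shows this on a hypothetical four-row DataFrame:

```python
import pandas as pd

# Hypothetical data with the default 0..3 index.
years = pd.DataFrame({"Year": [2015, 2016, 2017, 2018]})

positional = years.iloc[0:2]  # positions 0 and 1; the end is excluded
labelled = years.loc[0:2]     # labels 0, 1 and 2; the end is included

print(len(positional), len(labelled))
```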

### Manipulating the index
We can set the index to whatever suits us.
Here is what happens if we set 'County Code' as the index:

In [21]:
data3.set_index("County Code").head()

County Code,State Code,Site Num,Parameter Code,POC,Latitude,Longitude,Datum,Parameter Name,Sample Duration,Pollutant Standard,...,75th Percentile,50th Percentile,10th Percentile,Local Site Name,Address,State Name,County Name,City Name,CBSA Name,Date of Last Change
55,1,10,88502,3,33.991494,-85.992647,NAD83,Acceptable PM2.5 AQI & Speciation Mass,1 HOUR,,...,16.1,12.0,6.3,GADSDEN C. COLLEGE,"1001 WALLACE DRIVE, GADSDEN, AL 35902",Alabama,Etowah,Gadsden,"Gadsden, AL",2018-05-09
55,1,10,88502,3,33.991494,-85.992647,NAD83,Acceptable PM2.5 AQI & Speciation Mass,24-HR BLK AVG,,...,15.5,12.7,7.3,GADSDEN C. COLLEGE,"1001 WALLACE DRIVE, GADSDEN, AL 35902",Alabama,Etowah,Gadsden,"Gadsden, AL",2018-05-09
73,1,23,42401,2,33.553056,-86.815,WGS84,Sulfur dioxide,1 HOUR,SO2 1-hour 2010,...,2.8,1.7,0.2,North Birmingham,"NO. B'HAM,SOU R.R., 3009 28TH ST. NO.",Alabama,Jefferson,Birmingham,"Birmingham-Hoover, AL",2018-04-12
73,1,23,42401,2,33.553056,-86.815,WGS84,Sulfur dioxide,1 HOUR,SO2 Annual 1971,...,0.7,0.3,0.0,North Birmingham,"NO. B'HAM,SOU R.R., 3009 28TH ST. NO.",Alabama,Jefferson,Birmingham,"Birmingham-Hoover, AL",2018-04-12
73,1,23,42401,2,33.553056,-86.815,WGS84,Sulfur dioxide,24-HR BLK AVG,SO2 24-hour 1971,...,0.9,0.5,0.1,North Birmingham,"NO. B'HAM,SOU R.R., 3009 28TH ST. NO.",Alabama,Jefferson,Birmingham,"Birmingham-Hoover, AL",2018-04-12


Performing a set_index is useful when you can come up with an index for the dataset that is better than the current one.
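Note that set_index returns a new DataFrame and leaves the original untouched. The sketch below, on a hypothetical three-row table, shows a label lookup on the new index and how reset_index undoes the change:

```python
import pandas as pd

# Hypothetical miniature of a monitor table (values are illustrative).
sites = pd.DataFrame({
    "County Code": [55, 55, 73],
    "Parameter Name": ["PM2.5", "PM2.5", "Sulfur dioxide"],
})

indexed = sites.set_index("County Code")  # new DataFrame; sites is unchanged
rows_for_55 = indexed.loc[55]             # every row labelled 55
restored = indexed.reset_index()          # index moved back into a column

print(rows_for_55)
```

Because the index labels here are not unique, `indexed.loc[55]` returns several rows rather than a single one.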

### Conditional Selection
Sometimes we want to select data based on some condition.
Check what happens if we do this:


In [27]:
data2.State == "Alabama"

0       True
1       True
2       True
3       True
4       True
5       True
6       True
7       True
8       True
9      False
10     False
11     False
12     False
13     False
14     False
15     False
16     False
17     False
18     False
19     False
20     False
21     False
22     False
23     False
24     False
25     False
26     False
27     False
28     False
29     False
       ...  
563    False
564    False
565    False
566    False
567    False
568    False
569    False
570    False
571    False
572    False
573    False
574    False
575    False
576    False
577    False
578    False
579    False
580    False
581    False
582    False
583    False
584    False
585    False
586    False
587    False
588    False
589    False
590    False
591    False
592    False
Name: State, Length: 593, dtype: bool

This operation produced a Series of True/False booleans based on the state of each record.
This result can then be used inside of loc to select the relevant data:

In [28]:
data2.loc[data2.State == 'Alabama']

Unnamed: 0,State,County,Year,Days with AQI,Good Days,Moderate Days,Unhealthy for Sensitive Groups Days,Unhealthy Days,Very Unhealthy Days,Hazardous Days,Max AQI,90th Percentile AQI,Median AQI,Days CO,Days NO2,Days Ozone,Days SO2,Days PM2.5,Days PM10
0,Alabama,DeKalb,2018,58,58,0,0,0,0,0,44,37,31,0,0,58,0,0,0
1,Alabama,Etowah,2018,59,27,31,0,1,0,0,153,62,52,0,0,0,0,59,0
2,Alabama,Jefferson,2018,59,38,21,0,0,0,0,72,57,47,0,1,6,7,44,1
3,Alabama,Mobile,2018,59,55,4,0,0,0,0,53,38,17,0,0,0,20,39,0
4,Alabama,Morgan,2018,59,54,5,0,0,0,0,56,50,29,0,0,0,0,59,0
5,Alabama,Russell,2018,59,35,24,0,0,0,0,84,63,45,0,0,0,0,59,0
6,Alabama,Shelby,2018,58,57,0,1,0,0,0,124,9,0,0,0,0,58,0,0
7,Alabama,Sumter,2018,59,59,0,0,0,0,0,45,36,19,0,0,0,9,50,0
8,Alabama,Tuscaloosa,2018,54,43,11,0,0,0,0,62,54,38,0,0,0,0,54,0


It seems the first 9 entries are from Alabama!
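Because the comparison yields an ordinary boolean Series, you can store it in a variable and reuse it. This is a minimal sketch on a hypothetical three-row table (the `counties` name and its values are made up):

```python
import pandas as pd

# Hypothetical miniature of a county table.
counties = pd.DataFrame({
    "State": ["Alabama", "Alabama", "Alaska"],
    "Good Days": [58, 27, 90],
})

mask = counties["State"] == "Alabama"  # Series of True/False, one per row
alabama = counties.loc[mask]           # keep only the rows where mask is True
n_alabama = mask.sum()                 # True counts as 1, so this counts matches

print(alabama)
print(n_alabama)
```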

Suppose we want the Alabama counties with more than 40 good days. We can use the ampersand (&) to bring the two conditions together:

In [35]:
data2.loc[(data2.State == 'Alabama') & (data2['Good Days']>40)]

Unnamed: 0,State,County,Year,Days with AQI,Good Days,Moderate Days,Unhealthy for Sensitive Groups Days,Unhealthy Days,Very Unhealthy Days,Hazardous Days,Max AQI,90th Percentile AQI,Median AQI,Days CO,Days NO2,Days Ozone,Days SO2,Days PM2.5,Days PM10
0,Alabama,DeKalb,2018,58,58,0,0,0,0,0,44,37,31,0,0,58,0,0,0
3,Alabama,Mobile,2018,59,55,4,0,0,0,0,53,38,17,0,0,0,20,39,0
4,Alabama,Morgan,2018,59,54,5,0,0,0,0,56,50,29,0,0,0,0,59,0
6,Alabama,Shelby,2018,58,57,0,1,0,0,0,124,9,0,0,0,0,58,0,0
7,Alabama,Sumter,2018,59,59,0,0,0,0,0,45,36,19,0,0,0,9,50,0
8,Alabama,Tuscaloosa,2018,54,43,11,0,0,0,0,62,54,38,0,0,0,0,54,0


And suppose we want the rows that are from Alabama or have more than 40 good days. For that, we can use a pipe ( | ):

In [31]:
data2.loc[(data2.State == 'Alabama') | (data2['Good Days']>40)]

Unnamed: 0,State,County,Year,Days with AQI,Good Days,Moderate Days,Unhealthy for Sensitive Groups Days,Unhealthy Days,Very Unhealthy Days,Hazardous Days,Max AQI,90th Percentile AQI,Median AQI,Days CO,Days NO2,Days Ozone,Days SO2,Days PM2.5,Days PM10
0,Alabama,DeKalb,2018,58,58,0,0,0,0,0,44,37,31,0,0,58,0,0,0
1,Alabama,Etowah,2018,59,27,31,0,1,0,0,153,62,52,0,0,0,0,59,0
2,Alabama,Jefferson,2018,59,38,21,0,0,0,0,72,57,47,0,1,6,7,44,1
3,Alabama,Mobile,2018,59,55,4,0,0,0,0,53,38,17,0,0,0,20,39,0
4,Alabama,Morgan,2018,59,54,5,0,0,0,0,56,50,29,0,0,0,0,59,0
5,Alabama,Russell,2018,59,35,24,0,0,0,0,84,63,45,0,0,0,0,59,0
6,Alabama,Shelby,2018,58,57,0,1,0,0,0,124,9,0,0,0,0,58,0,0
7,Alabama,Sumter,2018,59,59,0,0,0,0,0,45,36,19,0,0,0,9,50,0
8,Alabama,Tuscaloosa,2018,54,43,11,0,0,0,0,62,54,38,0,0,0,0,54,0
9,Alaska,Denali,2018,90,90,0,0,0,0,0,49,44,39,0,0,90,0,0,0
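Each condition needs its own parentheses because & and | bind more tightly than == and >. Naming the conditions first avoids the issue, and ~ negates a condition. The sketch below uses a hypothetical four-row table with made-up values:

```python
import pandas as pd

# Hypothetical miniature of a county table.
counties = pd.DataFrame({
    "State": ["Alabama", "Alabama", "Alaska", "Arizona"],
    "Good Days": [58, 27, 90, 17],
})

in_alabama = counties["State"] == "Alabama"
many_good_days = counties["Good Days"] > 40

both = counties.loc[in_alabama & many_good_days]    # AND: both must hold
either = counties.loc[in_alabama | many_good_days]  # OR: at least one holds
not_alabama = counties.loc[~in_alabama]             # NOT: invert the mask

print(len(both), len(either), len(not_alabama))
```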


pandas comes with a few pre-built conditional selectors. We will highlight the two most useful.

The first is isin. isin lets you select data whose value "is in" a list of values.

For example, here's how we can use it to select data from Arizona and California:

In [34]:
data2.loc[data2.State.isin(['Arizona', 'California'])]

Unnamed: 0,State,County,Year,Days with AQI,Good Days,Moderate Days,Unhealthy for Sensitive Groups Days,Unhealthy Days,Very Unhealthy Days,Hazardous Days,Max AQI,90th Percentile AQI,Median AQI,Days CO,Days NO2,Days Ozone,Days SO2,Days PM2.5,Days PM10
10,Arizona,Apache,2018,94,94,0,0,0,0,0,26,16,10,0,0,0,0,8,86
11,Arizona,Cochise,2018,90,85,5,0,0,0,0,63,49,44,0,0,78,0,1,11
12,Arizona,Coconino,2018,90,89,1,0,0,0,0,61,47,43,0,0,90,0,0,0
13,Arizona,Gila,2018,90,34,33,20,3,0,0,181,126,64,0,0,29,57,0,4
14,Arizona,La Paz,2018,90,87,3,0,0,0,0,67,48,41,0,0,87,0,3,0
15,Arizona,Maricopa,2018,90,17,71,1,0,1,0,249,80,62,0,6,21,0,26,37
16,Arizona,Mohave,2018,90,88,2,0,0,0,0,82,21,9,0,0,0,0,0,90
17,Arizona,Navajo,2018,90,89,1,0,0,0,0,54,47,40,0,0,86,0,0,4
18,Arizona,Pima,2018,90,71,19,0,0,0,0,85,56,39,0,0,0,0,0,90
19,Arizona,Pinal,2018,90,83,7,0,0,0,0,67,50,44,0,0,90,0,0,0
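isin can also be negated with ~ to keep everything that is not in the list. Here is a short sketch with a hypothetical list of states:

```python
import pandas as pd

# Hypothetical data; only the State column matters here.
states = pd.DataFrame({"State": ["Alabama", "Arizona", "California", "Alaska"]})
wanted = ["Arizona", "California"]

selected = states.loc[states["State"].isin(wanted)]   # rows in the list
others = states.loc[~states["State"].isin(wanted)]    # rows not in the list

print(list(selected["State"]))
print(list(others["State"]))
```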


The second is isnull (and its companion notnull). These methods let you highlight values which are or are not empty (NaN). For example, to select the rows that lack a City Name in the dataset, here's what we would do:

In [39]:
data3.loc[data3['City Name'].isnull()]

Unnamed: 0,State Code,County Code,Site Num,Parameter Code,POC,Latitude,Longitude,Datum,Parameter Name,Sample Duration,...,75th Percentile,50th Percentile,10th Percentile,Local Site Name,Address,State Name,County Name,City Name,CBSA Name,Date of Last Change
59,1,73,1005,68102,1,33.331111,-87.003611,WGS84,Sample Volume,24 HOUR,...,24.000,24.000,24.000,McAdory,ROUTE 8 MCADORY,Alabama,Jefferson,,"Birmingham-Hoover, AL",2018-04-12
60,1,73,1005,68102,2,33.331111,-87.003611,WGS84,Sample Volume,24 HOUR,...,24.000,24.000,24.000,McAdory,ROUTE 8 MCADORY,Alabama,Jefferson,,"Birmingham-Hoover, AL",2018-04-12
61,1,73,1005,68105,1,33.331111,-87.003611,WGS84,Average Ambient Temperature,24 HOUR,...,16.800,10.900,-1.400,McAdory,ROUTE 8 MCADORY,Alabama,Jefferson,,"Birmingham-Hoover, AL",2018-04-12
62,1,73,1005,68105,2,33.331111,-87.003611,WGS84,Average Ambient Temperature,24 HOUR,...,14.000,12.800,-1.000,McAdory,ROUTE 8 MCADORY,Alabama,Jefferson,,"Birmingham-Hoover, AL",2018-04-12
63,1,73,1005,68108,1,33.331111,-87.003611,WGS84,Average Ambient Pressure,24 HOUR,...,757.000,751.000,745.000,McAdory,ROUTE 8 MCADORY,Alabama,Jefferson,,"Birmingham-Hoover, AL",2018-04-12
64,1,73,1005,68108,2,33.331111,-87.003611,WGS84,Average Ambient Pressure,24 HOUR,...,758.000,751.000,748.000,McAdory,ROUTE 8 MCADORY,Alabama,Jefferson,,"Birmingham-Hoover, AL",2018-04-12
65,1,73,1005,88101,1,33.331111,-87.003611,WGS84,PM2.5 - Local Conditions,24 HOUR,...,9.000,7.400,5.600,McAdory,ROUTE 8 MCADORY,Alabama,Jefferson,,"Birmingham-Hoover, AL",2018-04-12
66,1,73,1005,88101,1,33.331111,-87.003611,WGS84,PM2.5 - Local Conditions,24 HOUR,...,9.000,7.400,5.600,McAdory,ROUTE 8 MCADORY,Alabama,Jefferson,,"Birmingham-Hoover, AL",2018-04-12
67,1,73,1005,88101,1,33.331111,-87.003611,WGS84,PM2.5 - Local Conditions,24 HOUR,...,9.000,7.400,5.600,McAdory,ROUTE 8 MCADORY,Alabama,Jefferson,,"Birmingham-Hoover, AL",2018-04-12
68,1,73,1005,88101,1,33.331111,-87.003611,WGS84,PM2.5 - Local Conditions,24 HOUR,...,9.000,7.400,5.600,McAdory,ROUTE 8 MCADORY,Alabama,Jefferson,,"Birmingham-Hoover, AL",2018-04-12
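The companion notnull keeps the rows that do have a value, and summing the isnull mask counts the gaps. The sketch below uses a hypothetical three-row table in which None stands in for a missing value:

```python
import pandas as pd

# Hypothetical miniature of a site table; None marks a missing city.
sites = pd.DataFrame({
    "Local Site Name": ["McAdory", "North Birmingham", "McAdory"],
    "City Name": [None, "Birmingham", None],
})

missing = sites.loc[sites["City Name"].isnull()]    # rows without a city
present = sites.loc[sites["City Name"].notnull()]   # rows with a city
n_missing = sites["City Name"].isnull().sum()       # number of missing values

print(n_missing)
```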


### Assigning data
Going the other way, assigning data to a DataFrame is easy. You can assign either a constant value:

In [40]:
data3['Big Data'] = 'cool'
data3['Big Data']

0        cool
1        cool
2        cool
3        cool
4        cool
5        cool
6        cool
7        cool
8        cool
9        cool
10       cool
11       cool
12       cool
13       cool
14       cool
15       cool
16       cool
17       cool
18       cool
19       cool
20       cool
21       cool
22       cool
23       cool
24       cool
25       cool
26       cool
27       cool
28       cool
29       cool
         ... 
14787    cool
14788    cool
14789    cool
14790    cool
14791    cool
14792    cool
14793    cool
14794    cool
14795    cool
14796    cool
14797    cool
14798    cool
14799    cool
14800    cool
14801    cool
14802    cool
14803    cool
14804    cool
14805    cool
14806    cool
14807    cool
14808    cool
14809    cool
14810    cool
14811    cool
14812    cool
14813    cool
14814    cool
14815    cool
14816    cool
Name: Big Data, Length: 14817, dtype: object

Or with an iterable of values:

In [41]:
data1['index_backwards'] = range(len(data1), 0, -1)
data1['index_backwards']

0      316
1      315
2      314
3      313
4      312
5      311
6      310
7      309
8      308
9      307
10     306
11     305
12     304
13     303
14     302
15     301
16     300
17     299
18     298
19     297
20     296
21     295
22     294
23     293
24     292
25     291
26     290
27     289
28     288
29     287
      ... 
286     30
287     29
288     28
289     27
290     26
291     25
292     24
293     23
294     22
295     21
296     20
297     19
298     18
299     17
300     16
301     15
302     14
303     13
304     12
305     11
306     10
307      9
308      8
309      7
310      6
311      5
312      4
313      3
314      2
315      1
Name: index_backwards, Length: 316, dtype: int64
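Assignment also combines with conditional selection: passing a boolean mask and a column name to loc changes only the matching rows. This is a minimal sketch on a hypothetical table (the `Air` and `Rank` columns are invented for illustration):

```python
import pandas as pd

# Hypothetical data with made-up values.
counties = pd.DataFrame({"Good Days": [58, 27, 90]})

counties["Air"] = "ok"                                      # constant for every row
counties.loc[counties["Good Days"] > 50, "Air"] = "clean"   # only the matching rows
counties["Rank"] = range(len(counties), 0, -1)              # one value per row

print(counties)
```

When you assign an iterable, its length has to match the number of rows in the DataFrame.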

## There is a lot more you can do, so keep exploring on your own!