# Pandas Jupyter Experiment

For this exercise, you will choose a dataset that is already available as a CSV file, using either the links suggested in the module or an external source. The completeness and organization of the CSV will affect your success, so investigate it with our initial methods before you commit to it as your object of analysis.

I have provided the headings for each section: you can modify them to reflect your final workflow. Here's what you need to accomplish and document:

- Complete the five sequential stages of importing, analyzing, and visualizing your CSV data using the Pandas library. The headers are provided, but you will need to plan out and structure what happens in the code using a combination of our class exercise and the textbook for guidance.

- Create well-structured, readable documentation for every cell of your Python code. Use Markdown (as demonstrated in this example) and preview the results on GitHub to confirm it works as intended.

- As a bonus exercise, output and save a meaningful, formatted visualization, following the examples in the textbook.

# Stage One: Import Libraries / Data

This first cell imports Pandas under the shorthand name `pd`. It then opens a CSV file, which must already be downloaded and present in the same folder as the notebook. The result is stored under a name that is used throughout the workflow.

For this workflow, I am using a CSV file of service members who were awarded the Medal of Honor (MoH).

In [1]:
import pandas as pd

moh_df = pd.read_csv('medal_of_honor.csv', delimiter=",")

moh_df

Unnamed: 0,death,name,awarded.General Order number,awarded.accredited to,awarded.citation,awarded.issued,birth.location name,metadata.link,military record.company,military record.division,...,awarded.date.day,awarded.date.full,awarded.date.month,awarded.date.year,awarded.location.latitude,awarded.location.longitude,awarded.location.name,birth.date.day,birth.date.month,birth.date.year
0,True,"Sagelhurst, John C.",-1,,Under a heavy fire from the enemy carried off ...,01/03/1906,"Buffalo, N.Y.",http://www.cmohs.org/recipient-detail/1176/sag...,Company B,1st New Jersey Cavalry,...,6,1865-2-6,2,1865,38,-77,"Hatchers Run Court, Stafford, VA 22554, USA",-1,-1,-1
1,True,"Hack, John",-1,,Was one of a party which volunteered and attem...,01/03/1907,"1843, Germany",http://www.cmohs.org/recipient-detail/537/hack...,Company B,47th Ohio Infantry,...,3,1863-5-3,5,1863,32,-90,"Vicksburg, MS, USA",-1,-1,-1
2,True,"Carson, Anthony J.",-1,,Assumed command of a detachment of the company...,01/04/1906,"Boston, Mass.",http://www.cmohs.org/recipient-detail/2211/car...,Company H,43d Infantry,...,-1,-1--1--1,-1,-1,10,124,"Leyte, Philippines",-1,-1,-1
3,True,"Defranzos, Arthur F.",1,,For conspicuous gallantry and intrepidity at t...,01/04/1945,"Saugus, Mass.",http://www.cmohs.org/recipient-detail/2710/def...,,1st Infantry Division,...,10,1944-6-10,6,1944,49,0,"14490 Vaubadon, France",-1,-1,-1
4,True,"Kessler, Patrick L.",1,,For conspicuous gallantry and intrepidity at r...,01/04/1945,"Middletown, Ohio",http://www.cmohs.org/recipient-detail/2824/kes...,Company K,"30th Infantry, 3d Infantry Division",...,23,1944-5-23,5,1944,43,11,"50026 Ponte Rotto FI, Italy",-1,-1,-1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3470,True,"Yeager, Jacob F.",-1,,Seized a shell with fuze burning that had fall...,?,"Lehigh County, Pa.",http://www.cmohs.org/recipient-detail/1531/yea...,Company H,101st Ohio Infantry,...,11,1864-5-11,5,1864,31,-81,"Buzzards Roost, Brunswick, GA 31520, USA",-1,-1,-1
3471,True,"Yntema, Gordon Douglas",-1,,For conspicuous gallantry and intrepidity in a...,?,"Bethesda, Md.",http://www.cmohs.org/recipient-detail/3454/ynt...,Company D,5th Special Forces Group,...,-1,-1--1--1,-1,-1,10,106,Honorary Consulate General of the Republic of ...,26,6,1945
3472,True,"Young, Andrew J.",-1,,Capture of flag.,?,"Greene County, Pa.",http://www.cmohs.org/recipient-detail/1532/you...,Company F,1st Pennsylvania Cavalry,...,5,1865-4-5,4,1865,37,-78,"Virginia, USA",-1,-1,-1
3473,True,"Young, Benjamin F.",-1,,Capture of flag of 35th North Carolina Infantr...,?,"1844, Canada",http://www.cmohs.org/recipient-detail/1533/you...,Company I,1st Michigan Sharpshooters,...,17,1864-6-17,6,1864,37,-77,"Petersburg, VA, USA",-1,-1,-1
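
Before committing to a dataset, a few quick checks can show how complete and well organized it is. The cell below is a minimal sketch of that kind of inspection. So that it runs without the file present, it reads a small stand-in CSV built from three columns and five rows of the output above; `csv_text` and `sample_df` are names chosen for this illustration, not part of the original workflow.

```python
import io
import pandas as pd

# Stand-in for medal_of_honor.csv: three columns and five rows copied from
# the output above (an assumption made so the cell is self-contained).
csv_text = """name,awarded.issued,awarded.date.year
"Sagelhurst, John C.",01/03/1906,1865
"Hack, John",01/03/1907,1863
"Carson, Anthony J.",01/04/1906,-1
"Defranzos, Arthur F.",01/04/1945,1944
"Kessler, Patrick L.",01/04/1945,1944
"""
sample_df = pd.read_csv(io.StringIO(csv_text))

print(sample_df.shape)         # (number of rows, number of columns)
print(sample_df.dtypes)        # data type pandas inferred for each column
print(sample_df.isna().sum())  # count of missing values in each column
print(sample_df.head(3))       # first three rows
```

The same four calls should work unchanged on `moh_df`. Note that placeholder values such as `-1` or `?` are not counted by `isna()`, so they have to be spotted by eye or by checking minimum values.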


# Stage Two: Display a Summary and Sub-sections of the Data

The following three cells filter and view the data in different ways. Section 1 prints a random sample of the data, in this case 10 random MoH recipients. Section 2 prints the minimum and maximum award year, which should give the first and last year of the award in this dataset. Section 3 counts MoH awards by branch, which shows the distribution of the award across the branches.

In [9]:
# Section 1
moh_df.sample(10)

Unnamed: 0,death,name,awarded.General Order number,awarded.accredited to,awarded.citation,awarded.issued,birth.location name,metadata.link,military record.company,military record.division,...,awarded.date.day,awarded.date.full,awarded.date.month,awarded.date.year,awarded.location.latitude,awarded.location.longitude,awarded.location.name,birth.date.day,birth.date.month,birth.date.year
3283,True,"Wai, Francis B.",-1,,Captain Francis B. Wai distinguished himself b...,?,,http://www.cmohs.org/recipient-detail/3043/wai...,,,...,-1,-1--1--1,-1,-1,-1,-1,Unknown,-1,-1,-1
2125,True,"Kraus, Richard Edward",-1,Minnesota,For conspicuous gallantry and intrepidity at t...,?,"Chicago, Ill.",http://www.cmohs.org/recipient-detail/2835/kra...,,,...,-1,-1--1--1,-1,-1,7,134,"Palau Islands, Ngaremlengui, Palau",24,11,1925
2593,True,"Orr, Moses",-1,,Gallant conduct during cam paigns and engageme...,?,Ireland,http://www.cmohs.org/recipient-detail/1827/orr...,Company A,1st U.S. Cavalry,...,-1,-1--1--1,-1,-1,37,-95,United States,-1,-1,-1
2170,True,"Lejeune, Emile",212,New York,"Serving on board the U.S.S. Plymouth, Lejeune ...",?,"1853, France",http://www.cmohs.org/recipient-detail/2032/lej...,,,...,-1,-1--1--1,-1,-1,32,-80,"Port Royal, SC, USA",-1,-1,-1
1874,True,"Holden, Henry",-1,,Brought up ammunition under a galling fire fro...,?,England,http://www.cmohs.org/recipient-detail/1701/hol...,Company D,7th U.S. Cavalry,...,25,1876-6-25,6,1876,45,-107,"Little Bighorn River, Montana, USA",-1,-1,-1
481,True,"Collier, John W.",86,,"Cpl. Collier, Company C, distinguished himself...",08/02/1951,"Worthington, Ky.",http://www.cmohs.org/recipient-detail/3097/col...,Company C,27th Infantry Regiment,...,19,1950-9-19,9,1950,35,127,South Korea,3,4,1929
3302,True,"Ward, Charles H.",-1,,Gallantry in action with Indians.,?,England,http://www.cmohs.org/recipient-detail/1930/war...,Company G,1st U.S. Cavalry,...,20,1869-10-20,10,1869,31,-109,"Chiricahua Mountains, Coronado National Forest...",-1,-1,-1
1581,True,"Fraser, William W.",-1,,"Gallantry in the charge of the ""volunteer stor...",?,Scotland,http://www.cmohs.org/recipient-detail/461/fras...,Company I,97th Illinois Infantry,...,22,1863-5-22,5,1863,32,-90,"Vicksburg, MS, USA",-1,-1,-1
2883,True,"Savage, Auzella",59,Massachusetts,On board the U.S.S. Santiago de Cuba in the as...,?,"1846, Maine",http://www.cmohs.org/recipient-detail/1185/sav...,,,...,-1,-1--1--1,-1,-1,33,-77,"Fort Fisher Historic Museum, 1610 South Fort F...",-1,-1,-1
2967,True,"Shutes, Henry",71,Maryland,Served as captain of the forecastle on board t...,?,"1804, Baltimore, Md.",http://www.cmohs.org/recipient-detail/1243/shu...,,,...,-1,-1--1--1,-1,-1,29,-90,"New Orleans, LA, USA",-1,-1,-1


In [10]:
# Section 2
print(moh_df['awarded.date.year'].min())
print(moh_df['awarded.date.year'].max())
# The -1 printed is a placeholder value in this column, probably used where the award year was unavailable.

-1
2007
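
One possible way to get a meaningful earliest year is to filter out the `-1` values before taking the minimum. The sketch below assumes that `-1` always means "year unknown" in this column. It uses a handful of award years copied from rows shown earlier in this notebook, so the result describes only those rows, not the full dataset; `years` and `known_years` are illustrative names.

```python
import pandas as pd

# Award years copied from rows displayed earlier; -1 is assumed to mark an unknown year.
years = pd.Series([1865, 1863, -1, 1944, 1944, 1876, 1950], name='awarded.date.year')

known_years = years[years > 0]  # keep only rows with a real year
print(known_years.min())        # 1863
print(known_years.max())        # 1950
```

On the full dataset, the same filter would be `moh_df[moh_df['awarded.date.year'] > 0]`.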


In [11]:
# Section 3
moh_df['military record.organization'].value_counts()


U.S. Army            2428
U.S. Navy             734
U.S. Marine Corps     295
U.S. Air Force         17
U.S. Coast Guard        1
Name: military record.organization, dtype: int64

# Stage Three: Clean Your Data

(Varies by dataset: you might replace missing values with a filler, replace shorthand from the dataset with readable language, and/or delete duplicates)
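
As a starting point, here is a hedged sketch of the three cleaning steps mentioned above, applied to a small stand-in table. It assumes that `?` in `awarded.issued` and `-1` in `awarded.date.year` both stand for missing data. The last row is repeated on purpose to show `drop_duplicates()`; it is not a duplicate found in the real file. `clean_df` is an illustrative name.

```python
import numpy as np
import pandas as pd

# Stand-in rows copied from the output above; the final row is deliberately
# repeated so that drop_duplicates() has something to remove.
clean_df = pd.DataFrame({
    'name': ['Sagelhurst, John C.', 'Carson, Anthony J.',
             'Yeager, Jacob F.', 'Yeager, Jacob F.'],
    'awarded.issued': ['01/03/1906', '01/04/1906', '?', '?'],
    'awarded.date.year': [1865, -1, 1864, 1864],
})

# Replace the placeholder values with NaN so pandas treats them as missing.
clean_df['awarded.issued'] = clean_df['awarded.issued'].replace('?', np.nan)
clean_df['awarded.date.year'] = clean_df['awarded.date.year'].replace(-1, np.nan)

# Remove exact duplicate rows and renumber the index.
clean_df = clean_df.drop_duplicates().reset_index(drop=True)
print(clean_df)
```

Once the placeholders are `NaN`, methods such as `min()`, `max()`, and `isna()` handle them as missing values rather than as real years or dates.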

# Stage Four: Plot Your Data

(Try at least three different plots - bar, hist, box, area, scatter, hexbin, and pie are all worth viewing - and describe your choices)
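
A minimal sketch of two of these plot types, using the branch counts printed in Stage Two. It assumes Matplotlib is installed alongside Pandas. The `Agg` backend line is only there so the example runs outside a notebook; remove it inside Jupyter so the figure displays inline. `branch_counts` is an illustrative name.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line inside Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Counts copied from the Stage Two value_counts() output.
branch_counts = pd.Series(
    {'U.S. Army': 2428, 'U.S. Navy': 734, 'U.S. Marine Corps': 295,
     'U.S. Air Force': 17, 'U.S. Coast Guard': 1},
    name='awards',
)

# Draw a bar chart and a pie chart side by side.
fig, (ax_bar, ax_pie) = plt.subplots(1, 2, figsize=(12, 5))
branch_counts.plot(kind='bar', ax=ax_bar, title='MoH awards by branch')
ax_bar.set_ylabel('Number of awards')
branch_counts.plot(kind='pie', ax=ax_pie, title='Share of awards by branch')
ax_pie.set_ylabel('')  # hide the default axis label on the pie chart
fig.tight_layout()
```

The bar chart makes the absolute counts easy to read, while the pie chart shows proportions but makes the smallest branches nearly invisible. That trade-off is the kind of choice worth describing in this section.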

# Stage Five: Draw Comparisons and Make Claims

(Think of all the methods we've used so far - what might you analyze about this dataset? This section can mostly be commentary drawing on what you've found so far)
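
One simple comparison is to turn the Stage Two branch counts into percentages, which makes a claim such as "most awards went to one branch" easier to support. The sketch below recomputes the shares from the printed counts; `branch_counts` and `branch_share` are illustrative names.

```python
import pandas as pd

# Counts copied from the Stage Two value_counts() output.
branch_counts = pd.Series(
    {'U.S. Army': 2428, 'U.S. Navy': 734, 'U.S. Marine Corps': 295,
     'U.S. Air Force': 17, 'U.S. Coast Guard': 1},
    name='awards',
)

# Each branch's share of all awards, as a percentage rounded to one decimal.
branch_share = (branch_counts / branch_counts.sum() * 100).round(1)
print(branch_share)
```

On the full DataFrame, `moh_df['military record.organization'].value_counts(normalize=True)` should return the same shares as fractions instead of percentages.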

# Bonus: Export a Meaningful Visualization

(Use the guidance in the book to get started!)
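
A rough sketch of one way to save a formatted plot to a file, assuming Matplotlib is installed. The filename `moh_awards_by_branch.png` and the variable names are hypothetical choices for this illustration, and the textbook may recommend a different approach.

```python
import os
import matplotlib
matplotlib.use("Agg")  # headless backend so the file can be written without a display
import matplotlib.pyplot as plt
import pandas as pd

# Counts copied from the Stage Two value_counts() output.
branch_counts = pd.Series(
    {'U.S. Army': 2428, 'U.S. Navy': 734, 'U.S. Marine Corps': 295,
     'U.S. Air Force': 17, 'U.S. Coast Guard': 1},
    name='awards',
)

# Horizontal bars leave room for the long branch names.
ax = branch_counts.sort_values().plot(
    kind='barh', figsize=(8, 4), title='Medal of Honor awards by branch'
)
ax.set_xlabel('Number of awards')

fig = ax.get_figure()
fig.tight_layout()

out_path = 'moh_awards_by_branch.png'  # hypothetical output filename
fig.savefig(out_path, dpi=150)         # write the figure as a PNG file
print(os.path.getsize(out_path) > 0)   # confirm that a non-empty file was written
```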