# Exam 2

## 1. Introduction
The companion Google Form for this exam includes a set of questions that you can only answer by writing code. That code must be entered into this notebook.

Some requirements: 
1. This notebook must clearly **show all the work necessary to answer each question** in the Google Form.
1. **Your code must unambiguously show that you have answered each question**.  For questions where you are unable to determine the answer using code, include descriptive text explaining what you believe to be the answer, and any reason you are unable to show it in the code.
1. All code in this notebook must be **well-documented using Markdown**.  Indicate each question you are answering and clearly describe what each block of code is doing.  Your ability to write well-formatted Markdown that clearly explains your work will be factored into your grade.
1. Once you have determined the answer to each question, remember to **enter the answer for each question into the Google Form** - we need both this notebook and the completed Google Form in order to grade your work.
1. Save this notebook and **push this repository to GitHub when done**.


In [1]:
# given code... do not modify
%matplotlib inline
import numpy as np
import pandas as pd
from IPython.display import Markdown as md

df = pd.read_csv('./data/Production_Livestock_E_All_Data/Production_Livestock_E_All_Data.csv')

## 2. Data source and definitions
The data used in this notebook comes from the [Food and Agriculture Organization of the United Nations](http://www.fao.org/faostat/en/#data/QA) and represents counts of livestock produced in various areas around the world for every year from 1961 to 2019.

Several fields in this data may benefit from a bit of explanation.  You are welcome to review the definitions of the fields by clicking the green `Definitions and standards` button on the website linked above.
- `Area` represents a country or region, e.g. `"Austria"` and `"Western Europe"`
- `Item` represents the specific type or a group of livestock, e.g. `"Sheep"` and `"Sheep and Goats"`
- `Unit` indicates whether the numbers in the yearly count fields represent counts of single animals (`"Head"` or `"No"`) or thousands of animals (`"1000 Head"`).
- yearly counts are in the fields named `Yxxxx`, where `xxxx` is a year, e.g. `Y1961` for the counts of a given animal in a given area in the year 1961.
- fields named `YxxxxF` indicate whether the count for a given year is an estimate or an actual count.  Estimates are indicated with an `F` or `Im` in this field, while actual counts have no value in this field.
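
The naming convention above makes it possible to separate count fields from flag fields by pattern. The following is a minimal sketch on a small made-up frame (the `toy` DataFrame and its values are assumptions for illustration, not FAO data); it only mirrors the column layout described in the list.

```python
import re

import pandas as pd

# Toy frame with the same column layout as the FAO data (values are made up).
toy = pd.DataFrame({
    "Area": ["A", "B"],
    "Unit": ["Head", "1000 Head"],
    "Y1961": [100.0, 5.0],
    "Y1961F": [None, "F"],
    "Y1962": [110.0, 6.0],
    "Y1962F": ["Im", None],
})

# Count fields: "Y" followed by exactly four digits.
count_cols = [c for c in toy.columns if re.fullmatch(r"Y\d{4}", c)]
# Flag fields: the same pattern with a trailing "F".
flag_cols = [c for c in toy.columns if re.fullmatch(r"Y\d{4}F", c)]

# True where the flag marks the count as an estimate ("F" or "Im").
is_estimate = toy[flag_cols].isin(["F", "Im"])

print(count_cols)
print(flag_cols)
print(is_estimate)
```

The same two list comprehensions should work unchanged on `df.columns`, assuming the real file follows the naming described above.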

## 3. Data sample

In [2]:
# given code... do not modify
df.sample(5)

Unnamed: 0,Area Code,Area,Item Code,Item,Element Code,Element,Unit,Y1961,Y1961F,Y1962,...,Y2015,Y2015F,Y2016,Y2016F,Y2017,Y2017F,Y2018,Y2018F,Y2019,Y2019F
1745,166,Panama,1034,Pigs,5111,Stocks,Head,222289.0,,204000.0,...,365000.0,,389000.0,,400700.0,,369200.0,,354500.0,
967,84,Greece,1749,Sheep and Goats,5111,Stocks,Head,14417000.0,A,13565000.0,...,12874296.0,A,12655739.0,A,12826025.0,A,12055000.0,A,12007000.0,A
791,209,Eswatini,1110,Mules,5111,Stocks,Head,509.0,,788.0,...,106.0,Im,108.0,Im,108.0,Im,109.0,Im,111.0,Im
1657,158,Niger,1016,Goats,5111,Stocks,Head,4940000.0,,5200000.0,...,15478902.0,,16098058.0,,16741979.0,,17411658.0,,18108124.0,
845,67,Finland,1096,Horses,5111,Stocks,Head,234710.0,,227500.0,...,74200.0,,74200.0,,74400.0,,74400.0,,,


# Your Work
Add all the code and documentation you need to answer the exam questions below this block. Keep your notebook organized so it is obvious which question you are answering (no need to repeat the entire question) and the order of answers follows the order of the questions in the exam.

The following section will contain code you need to write to answer questions from the Pandas section of the Google form.


## Question 1

For this question, I created a new DataFrame containing only the rows where the `Area` column is `"China"`, which excludes all other areas. I then selected the `Y2017` column as a Series and summed it to get the total number of livestock animals produced in China in 2017.

In [18]:
df_china = df[df['Area'] == "China"]
df_china_2017 = df_china['Y2017']
df_china_2017.sum()

1258085411.0
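
Note that the sum above adds the `Y2017` values as they appear, regardless of the `Unit` column. Whether the exam expects that raw sum or a unit-adjusted one is not stated here, so the following is only a sketch of the alternative, on a made-up frame (`toy` and its values are assumptions). It scales `"1000 Head"` rows by 1000 before summing. Grouped items such as `"Sheep and Goats"` could also lead to double counting, which this sketch does not address.

```python
import pandas as pd

# Toy data (made up) with mixed units, mimicking the FAO layout.
toy = pd.DataFrame({
    "Area": ["China", "China", "Austria"],
    "Item": ["Pigs", "Chickens", "Pigs"],
    "Unit": ["Head", "1000 Head", "Head"],
    "Y2017": [500.0, 3.0, 40.0],
})

china = toy[toy["Area"] == "China"]

# Raw sum: treats every value as if it were in the same unit.
raw_total = china["Y2017"].sum()

# Unit-aware sum: convert each row to single animals first.
multiplier = china["Unit"].map({"Head": 1, "No": 1, "1000 Head": 1000})
scaled_total = (china["Y2017"] * multiplier).sum()

print(raw_total, scaled_total)
```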

## Question 2

To answer this question, I first created a new DataFrame containing only the rows where `Area` is `"United States of America"`, then limited it to the rows where `Item` is `"Pigs"`. The variable `difference` is a one-element Series holding the difference between the number of pigs in the USA in 2019 and in 1961. Finally, I used the `assign()` method to add a new column, `Difference_2019_1961_pigs`, to the `df_usa_pigs` DataFrame so the answer is displayed alongside the area and item.

In [20]:
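# keep only the rows for the USA
df_usa = df[df['Area'] == "United States of America"]
# keep only the pigs row
df_usa_pigs = df_usa[df_usa["Item"] == "Pigs"]
# change in the number of pigs between 1961 and 2019
difference = df_usa_pigs["Y2019"] - df_usa_pigs["Y1961"]
# attach the difference as a new column and display it
df_usa_pigs = df_usa_pigs.assign(Difference_2019_1961_pigs = difference)
df_usa_pigs[["Area", "Item", "Difference_2019_1961_pigs"]]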

Unnamed: 0,Area,Item,Difference_2019_1961_pigs
2451,United States of America,Pigs,23097600.0
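
As an alternative to the two-step filter, both conditions can be combined into one boolean mask and the difference read off as a single number. This is a sketch on made-up data (the `toy` frame and its values are assumptions), not a replacement for the answer above.

```python
import pandas as pd

# Toy data (made up) with the same relevant columns.
toy = pd.DataFrame({
    "Area": ["United States of America", "United States of America", "Panama"],
    "Item": ["Pigs", "Sheep", "Pigs"],
    "Y1961": [100.0, 50.0, 10.0],
    "Y2019": [160.0, 20.0, 30.0],
})

# Combine both conditions; each side needs its own parentheses.
mask = (toy["Area"] == "United States of America") & (toy["Item"] == "Pigs")

# Take the first (and here only) matching row as a Series.
row = toy.loc[mask, ["Y1961", "Y2019"]].iloc[0]
difference = row["Y2019"] - row["Y1961"]

print(difference)
```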


## Question 3

The answer to this question is "None of the above". The code below is the correct code: it selects the `Item` and `Y2019` columns from `df_usa`.

In [5]:
df_usa[["Item", "Y2019"]]

Unnamed: 0,Item,Y2019
2443,Asses,51971.0
2444,Beehives,2812000.0
2445,Cattle,94804700.0
2446,Chickens,1972256.0
2447,Ducks,7365.0
2448,Goats,2622000.0
2449,Horses,10702799.0
2450,Mules,
2451,Pigs,78657600.0
2452,Sheep,5230000.0


## Question 4

To create a pandas DataFrame containing only the rows related to the USA, I filtered `df` with the condition `df['Area'] == 'United States of America'`; the output contains only the rows that satisfy that condition. I named this new DataFrame `df_usa`.

In [6]:
df_usa = df[df['Area'] == 'United States of America']
df_usa

Unnamed: 0,Area Code,Area,Item Code,Item,Element Code,Element,Unit,Y1961,Y1961F,Y1962,...,Y2015,Y2015F,Y2016,Y2016F,Y2017,Y2017F,Y2018,Y2018F,Y2019,Y2019F
2443,231,United States of America,1107,Asses,5111,Stocks,Head,15000.0,F,15000.0,...,51986.0,Im,51976.0,Im,51974.0,Im,51972.0,Im,51971.0,Im
2444,231,United States of America,1181,Beehives,5114,Stocks,No,5514000.0,,5506000.0,...,2660000.0,,2775000.0,,2684000.0,,2828000.0,,2812000.0,
2445,231,United States of America,866,Cattle,5111,Stocks,Head,97700000.0,,100369008.0,...,89143000.0,,91888000.0,,93624600.0,,94298000.0,,94804700.0,
2446,231,United States of America,1057,Chickens,5112,Stocks,1000 Head,751000.0,F,782000.0,...,1998146.0,Im,1972264.0,Im,1971919.0,Im,1972088.0,Im,1972256.0,Im
2447,231,United States of America,1068,Ducks,5112,Stocks,1000 Head,3500.0,F,3400.0,...,7568.0,Im,7337.0,Im,7278.0,Im,7327.0,Im,7365.0,Im
2448,231,United States of America,1016,Goats,5111,Stocks,Head,3473000.0,,3647000.0,...,2650000.0,,2615000.0,,2627000.0,,2639000.0,,2622000.0,
2449,231,United States of America,1096,Horses,5111,Stocks,Head,2367000.0,F,2130000.0,...,10267608.0,Im,10528668.0,Im,10557429.0,Im,10630350.0,Im,10702799.0,Im
2450,231,United States of America,1110,Mules,5111,Stocks,Head,20000.0,F,20000.0,...,0.0,F,0.0,F,0.0,F,,M,,M
2451,231,United States of America,1034,Pigs,5111,Stocks,Head,55560000.0,,61837008.0,...,68919300.0,,71345400.0,,73144900.0,,75070200.0,,78657600.0,
2452,231,United States of America,976,Sheep,5111,Stocks,Head,32725008.0,,30969008.0,...,5280000.0,,5295000.0,,5270000.0,,5265000.0,,5230000.0,


## Question 5

To check the data type of a `Yxxxx` column, I used the `dtypes` attribute on the `Y1961` column, which returned `float64`.

In [7]:
df['Y1961'].dtypes

dtype('float64')
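
One common reason for a count column to come back as `float64` rather than an integer type is that `pd.read_csv` falls back to floats when a numeric column contains missing values. Whether that is the cause in this dataset is an assumption; the sketch below only demonstrates the behaviour on a small made-up CSV string.

```python
import io

import pandas as pd

# Made-up CSV text: Y2018 is complete, Y2019 has one missing value.
csv_text = "Item,Y2018,Y2019\nHorses,74400,\nPigs,75070200,78657600\n"
toy = pd.read_csv(io.StringIO(csv_text))

# The complete column is parsed as integers, the one with a gap as floats.
dtype_2018 = str(toy["Y2018"].dtype)
dtype_2019 = str(toy["Y2019"].dtype)

print(dtype_2018, dtype_2019)
```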

## Question 6

The `del` statement deletes the entire `Area Code` column from the given DataFrame (in this case `df_usa`). As a result, the number of columns decreases by one: there were 125 columns before `del` and 124 after.

In [8]:
df_usa = df[df['Area'] == "United States of America"]
del df_usa[ 'Area Code' ]
df_usa

Unnamed: 0,Area,Item Code,Item,Element Code,Element,Unit,Y1961,Y1961F,Y1962,Y1962F,...,Y2015,Y2015F,Y2016,Y2016F,Y2017,Y2017F,Y2018,Y2018F,Y2019,Y2019F
2443,United States of America,1107,Asses,5111,Stocks,Head,15000.0,F,15000.0,F,...,51986.0,Im,51976.0,Im,51974.0,Im,51972.0,Im,51971.0,Im
2444,United States of America,1181,Beehives,5114,Stocks,No,5514000.0,,5506000.0,,...,2660000.0,,2775000.0,,2684000.0,,2828000.0,,2812000.0,
2445,United States of America,866,Cattle,5111,Stocks,Head,97700000.0,,100369008.0,,...,89143000.0,,91888000.0,,93624600.0,,94298000.0,,94804700.0,
2446,United States of America,1057,Chickens,5112,Stocks,1000 Head,751000.0,F,782000.0,F,...,1998146.0,Im,1972264.0,Im,1971919.0,Im,1972088.0,Im,1972256.0,Im
2447,United States of America,1068,Ducks,5112,Stocks,1000 Head,3500.0,F,3400.0,F,...,7568.0,Im,7337.0,Im,7278.0,Im,7327.0,Im,7365.0,Im
2448,United States of America,1016,Goats,5111,Stocks,Head,3473000.0,,3647000.0,,...,2650000.0,,2615000.0,,2627000.0,,2639000.0,,2622000.0,
2449,United States of America,1096,Horses,5111,Stocks,Head,2367000.0,F,2130000.0,F,...,10267608.0,Im,10528668.0,Im,10557429.0,Im,10630350.0,Im,10702799.0,Im
2450,United States of America,1110,Mules,5111,Stocks,Head,20000.0,F,20000.0,F,...,0.0,F,0.0,F,0.0,F,,M,,M
2451,United States of America,1034,Pigs,5111,Stocks,Head,55560000.0,,61837008.0,,...,68919300.0,,71345400.0,,73144900.0,,75070200.0,,78657600.0,
2452,United States of America,976,Sheep,5111,Stocks,Head,32725008.0,,30969008.0,,...,5280000.0,,5295000.0,,5270000.0,,5265000.0,,5230000.0,
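
For comparison, `DataFrame.drop` removes a column without modifying the frame it is called on, and `shape` confirms the column count before and after. This is a sketch on made-up data (the `toy` frame is an assumption); calling `.copy()` on the filtered rows makes it explicit that the subset is independent of the original.

```python
import pandas as pd

# Toy data (made up) with an "Area Code" column to remove.
toy = pd.DataFrame({
    "Area Code": [231, 231, 166],
    "Area": ["United States of America", "United States of America", "Panama"],
    "Item": ["Pigs", "Sheep", "Pigs"],
})

# Explicit copy of the filtered rows.
usa = toy[toy["Area"] == "United States of America"].copy()
cols_before = usa.shape[1]

# drop() returns a new frame; `usa` and `toy` keep the column.
usa_no_code = usa.drop(columns=["Area Code"])
cols_after = usa_no_code.shape[1]

print(cols_before, cols_after)
```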
