# Pandas Exercises

## Pandas Series Basics

#### Import the 'pandas' library to load it into the computer's memory

In [1]:
import pandas as pd

#### Create the **employee_names** list

In [4]:
employee_names = ['Amy White', 'Jack Stewart', 'Richard Lauderdale', 'Sara Johnson']
employee_names

['Amy White', 'Jack Stewart', 'Richard Lauderdale', 'Sara Johnson']

#### Verify the **employee_names** object is a list.

In [5]:
type(employee_names)

list

#### Create a pandas Series object containing the elements from the **employee_names** list. Call it **employee_names_series**.

In [6]:
employee_names_series = pd.Series(employee_names)
employee_names_series

0             Amy White
1          Jack Stewart
2    Richard Lauderdale
3          Sara Johnson
dtype: object

#### Confirm the object is of a Series type.

In [7]:
type(employee_names_series)

pandas.core.series.Series

#### Now, create a Series object directly by using the following structure: **pd.Series([...])**
Let the elements of the Series object be the following numbers: 5, 8, 3, and 10. Name the object **work_experience_years**.

In [9]:
work_experience_years = pd.Series([5,8,3,10])
work_experience_years

0     5
1     8
2     3
3    10
dtype: int64

#### Import the 'NumPy' module

In [10]:
import numpy as np

#### Create the **array_age** NumPy array object with these values: 50, 53, 35, and 43.

In [13]:
array_age = np.array([50, 53, 35, 43])
array_age

array([50, 53, 35, 43])

#### Verify the type of the **array_age** object:

In [14]:
type(array_age)

numpy.ndarray

#### Create a Series object called **series_age** from the NumPy array object **array_age** you just created.

In [16]:
series_age = pd.Series(array_age)
series_age

0    50
1    53
2    35
3    43
dtype: int32

#### Check the type of the newly created object.

In [17]:
type(series_age)

pandas.core.series.Series

#### Display the content of **series_age**.

In [22]:
print(series_age)

0    50
1    53
2    35
3    43
dtype: int32
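The `int32` shown here, compared with `int64` in the other outputs, most likely reflects the default integer type NumPy chose on the machine where the notebook was run; that is an assumption about the environment, not something the notebook states. A minimal sketch of how to make the result independent of the platform is to request the dtype explicitly when building the array:

```python
import numpy as np
import pandas as pd

# Ask NumPy for a fixed-width integer type instead of relying on the platform default.
array_age = np.array([50, 53, 35, 43], dtype=np.int64)

# The Series keeps the dtype of the array it was built from.
series_age = pd.Series(array_age)

print(series_age.dtype)  # int64
```

With the dtype fixed this way, the Series reports `int64` regardless of the operating system.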


==========

## Working with Series Attributes

#### For the following Series object:

In [None]:
work_experience_years = pd.Series([5,8,3,10])
work_experience_years

#### Return the values stored in **work_experience_years**.

In [24]:
work_experience_years.values

array([ 5,  8,  3, 10], dtype=int64)

#### Check the type of the returned object.

In [25]:
type(work_experience_years.values)

numpy.ndarray

#### Use an attribute to find the number of elements in the underlying data for **work_experience_years**.

In [26]:
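# .size is the attribute that returns the total number of elements.
# (.count() is a method, and it returns the number of non-null elements.)
work_experience_years.size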

4
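For this Series, `.size` and `.count()` happen to agree because no values are missing. The sketch below uses a hypothetical Series with one missing value (not part of the exercise data) to show where the two differ:

```python
import numpy as np
import pandas as pd

# Hypothetical Series with one missing value, used only for illustration.
with_gap = pd.Series([5, 8, np.nan, 10])

print(with_gap.size)     # 4: every element, missing or not
print(with_gap.count())  # 3: only the non-missing elements
```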

#### Assign the following name to this Series: **Work Experience (Yrs.)**

In [27]:
work_experience_years.name = "Work Experience (Yrs.)"

#### Display the name of the Series work_experience_years

In [28]:
work_experience_years.name

'Work Experience (Yrs.)'

#### Display the Series itself, to see the name appear below the data values it contains.

In [29]:
work_experience_years

0     5
1     8
2     3
3    10
Name: Work Experience (Yrs.), dtype: int64
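As an alternative to assigning `.name` afterwards, the name can also be passed when the Series is created. A short sketch, reusing the same values and name:

```python
import pandas as pd

# The name parameter sets the Series name at construction time.
work_experience_years = pd.Series([5, 8, 3, 10], name="Work Experience (Yrs.)")

print(work_experience_years.name)
```

Both approaches leave the Series with the same name; which one to use is mostly a matter of when the name is known.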

==========

## Using an Index in Pandas Series

#### Execute the following code cell to create a dictionary with the employees' names as its *keys* and their ages as its *values*.

In [31]:
workers_age = {'Amy White':50, 'Jack Stewart':53, 'Richard Lauderdale':35, 'Sara Johnson':43}
workers_age

{'Amy White': 50,
 'Jack Stewart': 53,
 'Richard Lauderdale': 35,
 'Sara Johnson': 43}

#### Verify the type of **workers_age** is a dictionary.

In [32]:
type(workers_age)

dict

#### Create a Series from **workers_age**, giving it the same name, and print its content.

In [33]:
workers_age = pd.Series(workers_age)
workers_age

Amy White             50
Jack Stewart          53
Richard Lauderdale    35
Sara Johnson          43
dtype: int64

#### Verify **workers_age** is a Series object.

In [34]:
type(workers_age)

pandas.core.series.Series

#### Retrieve the index of **workers_age**.

In [37]:
workers_age.index

Index(['Amy White', 'Jack Stewart', 'Richard Lauderdale', 'Sara Johnson'], dtype='object')

==========

## Label-based vs Position-based Indexing

#### Create a pandas Series object from a dictionary with keys "Martin" and "George" and values 8 and 5, respectively. Call this Series **employees_work_exp**, short for "employees' work experience".

In [38]:
employees_work_exp = pd.Series({'Martin': 8, 'George': 5})
employees_work_exp

Martin    8
George    5
dtype: int64

#### Retrieve the index values to see they are *labels*.

In [39]:
employees_work_exp.index

Index(['Martin', 'George'], dtype='object')

#### Check the type of the first of these index values to confirm they are strings.

In [42]:
type(employees_work_exp.index[0])

str

#### Create a pandas Series object from a list that contains the following values: 44, 54, 65, 35. Call it **series_age**.

In [44]:
series_age = pd.Series([44, 54, 65, 35])
series_age

0    44
1    54
2    65
3    35
dtype: int64

#### Retrieve the index values of **series_age** to see they are integers, which means the data is indexed by position.

In [46]:
series_age.index

RangeIndex(start=0, stop=4, step=1)
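The difference between the two kinds of index shows up when selecting elements. As a brief sketch using the two Series from this section, `.loc` selects by label and `.iloc` selects by integer position:

```python
import pandas as pd

employees_work_exp = pd.Series({'Martin': 8, 'George': 5})
series_age = pd.Series([44, 54, 65, 35])

# Label-based selection: look the value up by its index label.
by_label = employees_work_exp.loc['George']

# Position-based selection: the second element, whatever its label is.
by_position = employees_work_exp.iloc[1]

# With the default RangeIndex, positions and labels coincide.
third_age = series_age.iloc[2]

print(by_label, by_position, third_age)  # 5 5 65
```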

==========

## Using Pandas Series Methods

#### Consider the following Series object.

In [47]:
employees_work_exp = pd.Series({
'Amy White'   : 3,
'Jack Stewart'   : 5,
'Richard Lauderdale'  : 4.5,
'Sara Johnson'  : 22,
'Patrick Adams' : 28,
'Jessica Baker'  : 14,
'Peter Hunt'   : 4,
'Daniel Lloyd'  : 6,
'John Owen'   : 1.5,
'Jennifer Phillips'  : 10,
'Courtney Rogers'   : 4.5,
'Anne Robinson'  : 2,
})

#### Use a pandas method to extract the first five values from this Series.

In [52]:
employees_work_exp.head()

Amy White              3.0
Jack Stewart           5.0
Richard Lauderdale     4.5
Sara Johnson          22.0
Patrick Adams         28.0
dtype: float64

#### Use a pandas method to retrieve the last four records of the object.

In [54]:
employees_work_exp.tail(4)

John Owen             1.5
Jennifer Phillips    10.0
Courtney Rogers       4.5
Anne Robinson         2.0
dtype: float64
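Note that `head()` and `tail()` return records in their stored order, not ranked by value. If the five largest values were wanted instead, one option (a sketch, not part of the original exercise) is `Series.nlargest`:

```python
import pandas as pd

employees_work_exp = pd.Series({
    'Amy White': 3,
    'Jack Stewart': 5,
    'Richard Lauderdale': 4.5,
    'Sara Johnson': 22,
    'Patrick Adams': 28,
    'Jessica Baker': 14,
    'Peter Hunt': 4,
    'Daniel Lloyd': 6,
    'John Owen': 1.5,
    'Jennifer Phillips': 10,
    'Courtney Rogers': 4.5,
    'Anne Robinson': 2,
})

# First five records in stored order.
first_five = employees_work_exp.head()

# Five largest values, sorted in descending order.
largest_five = employees_work_exp.nlargest(5)

print(largest_five)
```

Here `first_five` starts with Amy White (3.0), while `largest_five` starts with Patrick Adams (28.0).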

==========

## Introduction to Pandas DataFrames

Create the following DataFrame in 4 different ways. (You don't need to think about assigning index values yet.)

|   	|               Name 	| Age 	| Working Experience (Yrs.) 	|
|--:	|-------------------:	|----:	|--------------------------:	|
| 0 	|          Amy White 	|  50 	|                         5 	|
| 1 	|       Jack Stewart 	|  53 	|                         8 	|
| 2 	| Richard Lauderdale 	|  35 	|                         3 	|
| 3 	|       Sara Johnson 	|  43 	|                        10 	|

#### Using a Dictionary of Lists

In [56]:
data = {
        "Name":['Amy White', 'Jack Stewart', 'Richard Lauderdale', 'Sara Johnson'],
        "Age": [50, 53, 35, 43],
        "Working Experience (Yrs.)": [5,8,3,10]
    }
df = pd.DataFrame(data)
df

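|   |               Name | Age | Working Experience (Yrs.) |
|--:|-------------------:|----:|--------------------------:|
| 0 |          Amy White |  50 |                         5 |
| 1 |       Jack Stewart |  53 |                         8 |
| 2 | Richard Lauderdale |  35 |                         3 |
| 3 |       Sara Johnson |  43 |                        10 |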


#### Using a List of Dictionaries

In [64]:
data = [{'Name':'Amy White', 'Age':50, 'Working Experience (Yrs.)':5}, 
        {'Name':'Jack Stewart', 'Age':53, 'Working Experience (Yrs.)':8}, 
        {'Name':'Richard Lauderdale', 'Age':35, 'Working Experience (Yrs.)':3},
        {'Name':'Sara Johnson', 'Age':43, 'Working Experience (Yrs.)':10}]
df = pd.DataFrame(data)
df

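|   |               Name | Age | Working Experience (Yrs.) |
|--:|-------------------:|----:|--------------------------:|
| 0 |          Amy White |  50 |                         5 |
| 1 |       Jack Stewart |  53 |                         8 |
| 2 | Richard Lauderdale |  35 |                         3 |
| 3 |       Sara Johnson |  43 |                        10 |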


#### Using a Dictionary of Series Objects

In [59]:
names = pd.Series(['Amy White', 'Jack Stewart', 'Richard Lauderdale', 'Sara Johnson'])
ages = pd.Series([50, 53, 35, 43])
working_experience_yrs = pd.Series([5, 8, 3, 10])

df = pd.DataFrame({'Name': names, 'Age': ages, 'Working Experience (Yrs.)': working_experience_yrs})
df

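|   |               Name | Age | Working Experience (Yrs.) |
|--:|-------------------:|----:|--------------------------:|
| 0 |          Amy White |  50 |                         5 |
| 1 |       Jack Stewart |  53 |                         8 |
| 2 | Richard Lauderdale |  35 |                         3 |
| 3 |       Sara Johnson |  43 |                        10 |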


#### Using a List of Lists

In [61]:
data = [['Amy White', 50, 5], ['Jack Stewart', 53, 8], ['Richard Lauderdale', 35, 3], ['Sara Johnson', 43, 10]]
df = pd.DataFrame(data, columns=['Name', 'Age', 'Working Experience (Yrs.)'])
df

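|   |               Name | Age | Working Experience (Yrs.) |
|--:|-------------------:|----:|--------------------------:|
| 0 |          Amy White |  50 |                         5 |
| 1 |       Jack Stewart |  53 |                         8 |
| 2 | Richard Lauderdale |  35 |                         3 |
| 3 |       Sara Johnson |  43 |                        10 |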
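To check that the four construction routes really produce the same DataFrame, they can be compared with `DataFrame.equals`. The sketch below rebuilds each variant from shared lists; the variable names are illustrative and not taken from the exercise:

```python
import pandas as pd

columns = ['Name', 'Age', 'Working Experience (Yrs.)']
names = ['Amy White', 'Jack Stewart', 'Richard Lauderdale', 'Sara Johnson']
ages = [50, 53, 35, 43]
experience = [5, 8, 3, 10]
rows = list(zip(names, ages, experience))

# 1. Dictionary of lists: one key per column.
from_dict_of_lists = pd.DataFrame(dict(zip(columns, [names, ages, experience])))

# 2. List of dictionaries: one dictionary per row.
from_list_of_dicts = pd.DataFrame([dict(zip(columns, row)) for row in rows])

# 3. Dictionary of Series objects: one Series per column.
from_dict_of_series = pd.DataFrame({
    'Name': pd.Series(names),
    'Age': pd.Series(ages),
    'Working Experience (Yrs.)': pd.Series(experience),
})

# 4. List of lists: one inner list per row, column names passed separately.
from_list_of_lists = pd.DataFrame([list(row) for row in rows], columns=columns)

# equals() compares shape, index, column labels, dtypes, and values.
all_equal = all(
    from_dict_of_lists.equals(other)
    for other in (from_list_of_dicts, from_dict_of_series, from_list_of_lists)
)
print(all_equal)
```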


#### Modify the code below to add integers starting from 1 in ascending order as index values.  

In [62]:
data = {
    "Name":['Amy White', 'Jack Stewart', 'Richard Lauderdale', 'Sara Johnson'], 
    "Age":[50, 53, 35, 43], 
    "Working Experience (Yrs.)":[5,8,3,10]}
df = pd.DataFrame(data)
df

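|   |               Name | Age | Working Experience (Yrs.) |
|--:|-------------------:|----:|--------------------------:|
| 0 |          Amy White |  50 |                         5 |
| 1 |       Jack Stewart |  53 |                         8 |
| 2 | Richard Lauderdale |  35 |                         3 |
| 3 |       Sara Johnson |  43 |                        10 |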


becomes

In [63]:
data = {
    "Name":['Amy White', 'Jack Stewart', 'Richard Lauderdale', 'Sara Johnson'], 
    "Age":[50, 53, 35, 43], 
    "Working Experience (Yrs.)":[5,8,3,10]}
df = pd.DataFrame(data, index=[1,2,3,4])
df

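|   |               Name | Age | Working Experience (Yrs.) |
|--:|-------------------:|----:|--------------------------:|
| 1 |          Amy White |  50 |                         5 |
| 2 |       Jack Stewart |  53 |                         8 |
| 3 | Richard Lauderdale |  35 |                         3 |
| 4 |       Sara Johnson |  43 |                        10 |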
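Hard-coding `[1,2,3,4]` works for four rows, but it has to be kept in sync with the data by hand. One possible alternative, offered as a sketch rather than the exercise's intended answer, is to derive the index from the number of rows:

```python
import pandas as pd

data = {
    "Name": ['Amy White', 'Jack Stewart', 'Richard Lauderdale', 'Sara Johnson'],
    "Age": [50, 53, 35, 43],
    "Working Experience (Yrs.)": [5, 8, 3, 10],
}
df = pd.DataFrame(data)

# Replace the default 0-based index with one that starts at 1
# and automatically matches the number of rows.
df.index = pd.RangeIndex(start=1, stop=len(df) + 1)

print(list(df.index))  # [1, 2, 3, 4]
```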


==========

# Good Luck!