# Pandas exercises


Pandas is a Python library that makes it easy to manipulate, analyse, clean, and explore data.
The name "Pandas" refers to both "Panel Data" and "Python Data Analysis".

## Exercise 1 - List-to-Series Conversion
Given a list, output the corresponding pandas Series.

In [22]:
import numpy as np
import pandas as pd

given_list = [2, 4, 5, 6, 9]

series = pd.Series(given_list)

print(series)

0    2
1    4
2    5
3    6
4    9
dtype: int64


## Exercise 2 - List-to-Series Conversion with Custom Indexing
Given a list, output the corresponding pandas Series, using only odd numbers as index labels.

In [23]:
given_list = [2, 4, 5, 6, 9]

series = pd.Series(given_list, index = [1, 3, 5, 7, 9])

print(series)

1    2
3    4
5    5
7    6
9    9
dtype: int64


## Exercise 3 - Date Series Generation
Generate the series of dates from 1 May 2021 to 12 May 2021 (both inclusive).

In [24]:
date_series = pd.date_range(start = '2021-05-01', end = '2021-05-12') # ISO dates (YYYY-MM-DD) avoid day/month ambiguity

print(date_series)

DatetimeIndex(['2021-05-01', '2021-05-02', '2021-05-03', '2021-05-04',
               '2021-05-05', '2021-05-06', '2021-05-07', '2021-05-08',
               '2021-05-09', '2021-05-10', '2021-05-11', '2021-05-12'],
              dtype='datetime64[ns]', freq='D')
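
The same range can also be described by a start date and a number of days instead of an end date. A minimal sketch, assuming the default daily frequency is what is wanted (the variable names `by_end` and `by_count` are only illustrative):

```python
import pandas as pd

# Range defined by its two endpoints (both are included)
by_end = pd.date_range(start='2021-05-01', end='2021-05-12')

# Range defined by a start date and a count of daily periods
by_count = pd.date_range(start='2021-05-01', periods=12)

# Both calls describe the same 12 days
print(by_end.equals(by_count))
```

Using `periods` can be convenient when the length of the range is known but the end date is not.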


## Exercise 4 - Dictionary-to-Dataframe Conversion
Given the dictionary below, convert it into the corresponding dataframe and display it:


dictionary = {'name': ['Vinay', 'Kushal', 'Aman'],
              'age' : [22, 25, 24],
              'occ' : ['engineer', 'doctor', 'accountant']}

In [26]:
dictionary = {'name': ['Vinay', 'Kushal', 'Aman'],
              'age' : [22, 25, 24],
              'occ' : ['engineer', 'doctor', 'accountant']}

dataframe = pd.DataFrame(dictionary)

print(dataframe)


     name  age         occ
0   Vinay   22    engineer
1  Kushal   25      doctor
2    Aman   24  accountant


## Exercise 5 - Setting Custom Index in Dataframe
Given a dataframe, change its index from the default integer index to a particular column.

Please use the dataframe generated in Exercise 4.

In [27]:
dataframe_customindex = dataframe.set_index('age') # new dataframe indexed by the 'age' column

print(dataframe_customindex)

       name         occ
age                    
22    Vinay    engineer
25   Kushal      doctor
24     Aman  accountant


## Exercise 6 - Sorting a Dataframe by Multiple Columns
Use the dataframe generated in Exercise 4, and sort it by multiple columns: 'name' and 'age'.

In [28]:
dictionary = {'name': ['Vinay', 'Kushal', 'Aman'],
              'age' : [22, 25, 24],
              'occ' : ['engineer', 'doctor', 'accountant']}

dataframe = pd.DataFrame(dictionary)
dataframe_sorted = dataframe.sort_values(by = ['name', 'age']) # dataframe after sorting by 'name', then 'age'

print(dataframe_sorted)


     name  age         occ
2    Aman   24  accountant
1  Kushal   25      doctor
0   Vinay   22    engineer
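
In the data above every name is unique, so the second sort key never comes into play. A minimal sketch with a hypothetical dataframe containing a repeated name (the rows are invented purely for illustration) shows how 'age' breaks ties:

```python
import pandas as pd

# Hypothetical data with a repeated name, invented for illustration only
people = pd.DataFrame({'name': ['Vinay', 'Aman', 'Vinay'],
                       'age': [30, 24, 22]})

# Sort by 'name' first; rows sharing a name are then ordered by 'age'
people_sorted = people.sort_values(by=['name', 'age'])

print(people_sorted)
```

The two 'Vinay' rows end up next to each other, with the younger one first.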


## Exercise 7 - Conditional Selection of Rows in a DataFrame
Use the dataframe generated in Exercise 4, and select rows based on a condition: age >= 24

In [29]:
dataframe_condition = dataframe.loc[dataframe.age >= 24]

print(dataframe_condition)

     name  age         occ
1  Kushal   25      doctor
2    Aman   24  accountant
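
The comparison operator decides whether the boundary value is kept. A short sketch, rebuilding the Exercise 4 dataframe so that it runs on its own, contrasts `>` with `>=` (the names `strictly_older` and `at_least_24` are only illustrative):

```python
import pandas as pd

dataframe = pd.DataFrame({'name': ['Vinay', 'Kushal', 'Aman'],
                          'age': [22, 25, 24],
                          'occ': ['engineer', 'doctor', 'accountant']})

# Strict comparison: the row where age is exactly 24 is excluded
strictly_older = dataframe.loc[dataframe['age'] > 24]

# Non-strict comparison: the row where age is exactly 24 is included
at_least_24 = dataframe.loc[dataframe['age'] >= 24]

print(strictly_older)
print(at_least_24)
```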


## Exercise 8 - Reading and Writing Files
Pandas can read CSV, JSON, Excel, and other file formats. Work through the tutorial below to learn more about the supported formats (you will want to read all your data sources with Pandas afterwards):

https://pandas.pydata.org/docs/getting_started/intro_tutorials/02_read_write.html#min-tut-02-read-write
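
As a small illustration of the read/write pattern, the sketch below reads CSV text and writes it back out. It uses an in-memory buffer with made-up content so that it runs without any file on disk; with a real file, a path would be passed instead.

```python
import io
import pandas as pd

# Hypothetical CSV content held in memory (stands in for a file on disk)
csv_text = "name,age\nVinay,22\nKushal,25\n"

# read_csv accepts a path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))
print(df)

# to_csv writes the dataframe back out; index=False omits the row index
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Reading the written text again gives back an equal dataframe
df_again = pd.read_csv(io.StringIO(buffer.getvalue()))
print(df_again.equals(df))
```

Other formats follow the same naming scheme, for example `pd.read_json` and `pd.read_excel` for reading.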

## Exercise 9 - Exploration of the Salaries Dataframe (optional)
Using the provided salaries.csv file, try to answer the following questions:

-1) What is the type of the salary column?

-2) What is the type of all columns?

-3) Select the salary column and display the highest value.

-4) Select the first 20 rows of the dataframe

-5) Display the last 2 rows of the dataframe

-6) Give the summary for the numeric columns in the dataset

-7) Calculate the standard deviation for each numeric column

-8) What is the average salary?

-9) What is the most answered grade in the dataframe?

-10) What is the least answered grade in the dataframe?

-11) Structure your code to have two functions: one for reading and ...
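
One possible way to approach these questions is sketched below. Since the contents of salaries.csv are not shown here, the sketch uses a small made-up dataframe; the column names `salary` and `grade` and all values are assumptions and may not match the real file.

```python
import pandas as pd

# Hypothetical stand-in for salaries.csv; real column names may differ
salaries = pd.DataFrame({'grade': ['A', 'B', 'A', 'C', 'A', 'B'],
                         'salary': [3000, 2500, 3200, 2000, 3100, 2600]})

print(salaries['salary'].dtype)          # type of one column
print(salaries.dtypes)                   # type of every column
print(salaries['salary'].max())          # highest value in a column
print(salaries.head(20))                 # first 20 rows (fewer if the data is shorter)
print(salaries.tail(2))                  # last 2 rows
print(salaries.describe())               # summary of the numeric columns
print(salaries.std(numeric_only=True))   # standard deviation per numeric column
print(salaries['salary'].mean())         # average of a column

# Count how often each grade occurs, then pick the extremes
grade_counts = salaries['grade'].value_counts()
print(grade_counts.idxmax())             # most frequent grade
print(grade_counts.idxmin())             # least frequent grade
```

With the real data, the dataframe would instead come from `pd.read_csv` applied to the provided file.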

Pandas documentation: https://pandas.pydata.org/docs/getting_started/index.html