# Pandas Exercises

This notebook contains various exercises to practice using the Pandas library in Python. 

### 1. Write a Python program to create and display a one-dimensional array-like object containing an array of data using the Pandas module.


In [1]:
import numpy as np
import pandas as pd

In [2]:
numbers = [11,23,34,34,15,6,27,48,59]
pd.Series(numbers)

0    11
1    23
2    34
3    34
4    15
5     6
6    27
7    48
8    59
dtype: int64

### 2. Write a Python program to convert a Pandas Series to a Python list and display its type.

In [3]:
series = pd.Series([10, 20, 30, 40, 50])
print("Type before:", type(series))

python_list = series.tolist()

print("Python List:", python_list)
print("Type after:", type(python_list))

Type before: <class 'pandas.core.series.Series'>
Python List: [10, 20, 30, 40, 50]
Type after: <class 'list'>
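
As a small complement to the cell above, the sketch below contrasts `tolist()` with `to_numpy()`. The variable names `as_list` and `as_array` are illustrative and not part of the exercise. `tolist()` hands back plain Python scalars, whereas `to_numpy()` keeps the data as a NumPy array.

```python
import numpy as np
import pandas as pd

series = pd.Series([10, 20, 30, 40, 50])

# tolist() converts every element to a built-in Python scalar
as_list = series.tolist()

# to_numpy() returns the values as a NumPy array instead of a list
as_array = series.to_numpy()

print(type(as_list), type(as_list[0]))
print(type(as_array), as_array.dtype)
```

Which form is preferable depends on what consumes the data next: a list suits plain Python code, an array suits further numeric work.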


### 3. Write a Python program to add, subtract, multiply and divide two Pandas Series. Sample Series: [2, 4, 6, 8, 10], [1, 3, 5, 7, 9]


In [4]:
series1 = pd.Series([2, 4, 6, 8, 10])
series2 = pd.Series([1, 3, 5, 7, 9])

# Adding the two Series
addition_result = series1 + series2

# Subtracting the two Series
subtraction_result = series1 - series2

# Multiplying the two Series
multiplication_result = series1 * series2

# Dividing the two Series
division_result = series1 / series2

# Printing the results
print("Series 1:\n", series1)
print("\nSeries 2:\n", series2)
print("\nAddition:\n", addition_result)
print("\nSubtraction:\n", subtraction_result)
print("\nMultiplication:\n", multiplication_result)
print("\nDivision:\n", division_result)

Series 1:
 0     2
1     4
2     6
3     8
4    10
dtype: int64

Series 2:
 0    1
1    3
2    5
3    7
4    9
dtype: int64

Addition:
 0     3
1     7
2    11
3    15
4    19
dtype: int64

Subtraction:
 0    1
1    1
2    1
3    1
4    1
dtype: int64

Multiplication:
 0     2
1    12
2    30
3    56
4    90
dtype: int64

Division:
 0    2.000000
1    1.333333
2    1.200000
3    1.142857
4    1.111111
dtype: float64
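
The operators above pair elements by index label, which works cleanly here because both Series share the default index. The sketch below uses two hypothetical Series, `s1` and `s2`, whose labels only partly overlap, to show what happens otherwise and how the method form `Series.add` with `fill_value` can handle it.

```python
import pandas as pd

# Hypothetical Series whose index labels only partly overlap
s1 = pd.Series([2, 4, 6], index=["a", "b", "c"])
s2 = pd.Series([1, 3], index=["a", "b"])

# The + operator aligns on labels; "c" has no partner, so the result is NaN there
plain_sum = s1 + s2

# The method form accepts fill_value, which stands in for the missing operand
filled_sum = s1.add(s2, fill_value=0)

print(plain_sum)
print(filled_sum)
```

The same pattern applies to `sub`, `mul` and `div`, which accept `fill_value` as well.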


### 4. Write a Python program to get the powers of array values element-wise. Note: First array elements raised to powers from second array
Expected Output:
Original array
[0 1 2 3 4 5 6]
First array elements raised to powers from second array, element-wise: [ 0 1 8 27 64 125 216]

In [5]:
base_array = pd.Series([0, 1, 2, 3, 4, 5, 6])
exponent_array = pd.Series([0, 1, 2, 3, 4, 5, 6])

# Computing the powers element-wise
result = base_array ** exponent_array

# Printing the original arrays and the result
print("Original base array:\n", base_array.values)
print("Original exponent array:\n", exponent_array.values)
print("\nFirst array elements raised to powers from second array, element-wise:\n", result.values)

Original base array:
 [0 1 2 3 4 5 6]
Original exponent array:
 [0 1 2 3 4 5 6]

First array elements raised to powers from second array, element-wise:
 [    1     1     4    27   256  3125 46656]
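
The expected output quoted in the exercise statement lists cubes, which presumably corresponds to raising every element to the power 3 rather than to the exponents 0 to 6 used in the cell above. A minimal sketch under that assumption (the name `cubes` is illustrative):

```python
import pandas as pd

base_array = pd.Series([0, 1, 2, 3, 4, 5, 6])

# Raising every element to the same scalar power (3) yields the cubes
cubes = base_array ** 3

print("Original array:", base_array.tolist())
print("Cubes:", cubes.tolist())  # [0, 1, 8, 27, 64, 125, 216]
```

In the element-wise version above, the first result is 1 rather than 0 because `0 ** 0` evaluates to 1.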



### 5. Write a Python program to create and display a DataFrame from a specified dictionary data which has the index labels.
Sample Python dictionary data and list labels:
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19], 'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']

In [6]:
exam_data = {
    'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
    'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
    'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
    'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']
}

# List of index labels
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']

# Creating the DataFrame with specified index labels
df = pd.DataFrame(exam_data, index=labels)

# Displaying the DataFrame
print("DataFrame:")
print(df)


DataFrame:
        name  score  attempts qualify
a  Anastasia   12.5         1     yes
b       Dima    9.0         3      no
c  Katherine   16.5         2     yes
d      James    NaN         3      no
e      Emily    9.0         2      no
f    Michael   20.0         3     yes
g    Matthew   14.5         1     yes
h      Laura    NaN         1      no
i      Kevin    8.0         2      no
j      Jonas   19.0         1     yes



### 6. Write a Python program to display a summary of the basic information about a specified DataFrame and its data.


In [7]:
print("Basic Information about the DataFrame:\n")
df.info()

# Displaying a statistical summary of the DataFrame
print("\nStatistical Summary of the DataFrame:\n")
print(df.describe())

# Displaying a preview of the DataFrame
print("\n\n\nPreview of the DataFrame:\n")
print(df.head())

Basic Information about the DataFrame:

<class 'pandas.core.frame.DataFrame'>
Index: 10 entries, a to j
Data columns (total 4 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   name      10 non-null     object 
 1   score     8 non-null      float64
 2   attempts  10 non-null     int64  
 3   qualify   10 non-null     object 
dtypes: float64(1), int64(1), object(2)
memory usage: 400.0+ bytes

Statistical Summary of the DataFrame:

           score   attempts
count   8.000000  10.000000
mean   13.562500   1.900000
std     4.693746   0.875595
min     8.000000   1.000000
25%     9.000000   1.000000
50%    13.500000   2.000000
75%    17.125000   2.750000
max    20.000000   3.000000



Preview of the DataFrame:

        name  score  attempts qualify
a  Anastasia   12.5         1     yes
b       Dima    9.0         3      no
c  Katherine   16.5         2     yes
d      James    NaN         3      no
e      Emily    9.0         2      no


### 7. Write a Python program to get the first 3 rows of a given DataFrame.



In [8]:
df.head(3)

        name  score  attempts qualify
a  Anastasia   12.5         1     yes
b       Dima    9.0         3      no
c  Katherine   16.5         2     yes


### 8. Write a Python program to select the 'name' and 'score' columns from the following DataFrame.
Sample Python dictionary data and list labels:
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']

In [9]:
df[['name', 'score']]

        name  score
a  Anastasia   12.5
b       Dima    9.0
c  Katherine   16.5
d      James    NaN
e      Emily    9.0
f    Michael   20.0
g    Matthew   14.5
h      Laura    NaN
i      Kevin    8.0
j      Jonas   19.0


### 9. Write a Python program to select the specified columns and rows from a given data frame.
Select 'name' and 'score' columns in rows 1, 3, 5, 6 from the given data frame.

In [10]:
selected_rows = [1, 3, 5, 6]  # zero-based row positions
selected_columns = ['name', 'score']
# Use the two lists defined above instead of repeating their values
selected_data = df.iloc[selected_rows][selected_columns]

# Displaying the selected data
print("Selected Data:\n")
print(selected_data)

Selected Data:

      name  score
b     Dima    9.0
d    James    NaN
f  Michael   20.0
g  Matthew   14.5
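
The selection above chains two indexing steps. One possible single-step alternative is to translate the row positions into labels and pass rows and columns to `.loc` together. The sketch below uses a reduced, hypothetical frame `df_small` so that it runs on its own.

```python
import numpy as np
import pandas as pd

# A reduced, hypothetical version of the exam data
df_small = pd.DataFrame(
    {
        "name": ["Anastasia", "Dima", "Katherine", "James"],
        "score": [12.5, 9, np.nan, 16.5],
        "attempts": [1, 3, 2, 3],
    },
    index=["a", "b", "c", "d"],
)

rows = [1, 3]             # zero-based row positions
cols = ["name", "score"]  # column labels

# Translate positions to labels, then select rows and columns in one .loc call
picked = df_small.loc[df_small.index[rows], cols]
print(picked)
```

A single `.loc` call is also the safer form when the selection is later used for assignment, since chained indexing may operate on a temporary copy.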


### 10. Write a Python program to select the rows where the number of attempts in the examination is greater than 2.

In [11]:
df[df['attempts'] > 2]

      name  score  attempts qualify
b     Dima    9.0         3      no
d    James    NaN         3      no
f  Michael   20.0         3     yes
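
The boolean-mask filter above can also be written with `DataFrame.query`, which takes the condition as a string. The sketch below runs on a small hypothetical frame, `df_small`, rather than the full exam data.

```python
import pandas as pd

# Small hypothetical frame with the columns needed for the filter
df_small = pd.DataFrame(
    {"name": ["Anastasia", "Dima", "Katherine"], "attempts": [1, 3, 2]},
    index=["a", "b", "c"],
)

# Boolean-mask form, as used in the exercise
by_mask = df_small[df_small["attempts"] > 2]

# Equivalent string-expression form
by_query = df_small.query("attempts > 2")

print(by_query)
```

Both forms return the same rows; `query` tends to read more easily once several conditions are combined.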


### 11. Write a Python program to count the number of rows and columns of a DataFrame.

In [12]:
print(f'Number of columns: {df.shape[1]}')
print(f'Number of rows: {df.shape[0]}')

Number of columns: 4
Number of rows: 10


### 12. Write a Python program to select the rows where the score is missing, i.e. is NaN.

In [13]:
df[df['score'].isna()]

    name  score  attempts qualify
d  James    NaN         3      no
h  Laura    NaN         1      no
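
`isna()` is needed here because NaN never compares equal to anything, including itself, so an equality test finds no rows. A short sketch on a hypothetical Series `scores`:

```python
import numpy as np
import pandas as pd

# Hypothetical scores with two missing entries
scores = pd.Series([12.5, np.nan, 9.0, np.nan], index=["a", "b", "c", "d"])

# Comparing with == never matches NaN, so this count is 0
equal_to_nan = (scores == np.nan).sum()

# isna() flags the missing entries correctly
missing_count = scores.isna().sum()
missing_labels = scores[scores.isna()].index.tolist()

print(equal_to_nan, missing_count, missing_labels)
```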


### 13. Write a Python program to select the rows where the number of attempts in the examination is less than 2 and the score is greater than 15.

In [14]:
df[(df['attempts']<2) & (df['score']>15)]

    name  score  attempts qualify
j  Jonas   19.0         1     yes


### 14. Write a Python program to change the score in row 'd' to 11.5.

In [15]:
df.loc['d', 'score']=11.5
df

        name  score  attempts qualify
a  Anastasia   12.5         1     yes
b       Dima    9.0         3      no
c  Katherine   16.5         2     yes
d      James   11.5         3      no
e      Emily    9.0         2      no
f    Michael   20.0         3     yes
g    Matthew   14.5         1     yes
h      Laura    NaN         1      no
i      Kevin    8.0         2      no
j      Jonas   19.0         1     yes


### 15. Write a Python program to calculate the sum of the examination attempts by the students.

In [16]:
df['attempts'].sum()

19

### 16. Write a Python program to calculate the mean of the students' scores in the DataFrame.

In [17]:
df['score'].mean()


13.333333333333334
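
The result above is computed over nine scores, not ten, because `mean()` skips NaN by default and row 'h' still has a missing score. The sketch below illustrates that behaviour on a hypothetical three-element Series.

```python
import numpy as np
import pandas as pd

# Hypothetical scores with one missing entry
scores = pd.Series([10.0, 20.0, np.nan])

# Default: NaN is ignored, so the mean is (10 + 20) / 2
mean_default = scores.mean()

# skipna=False lets the missing value propagate into the result
mean_strict = scores.mean(skipna=False)

print(mean_default)  # 15.0
print(mean_strict)   # nan
```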

### 17. Write a Python program to append a new row 'k' to the DataFrame with given values for each column. Then delete the new row and return the original DataFrame.

In [18]:
print(f'Before add new entry: \n{df}')
df.loc['k'] = ['Alex', 15, 2, 'yes']
print(f'after add new entry: \n{df}')
df=df.drop('k')
print(f'after remove last entry: \n{df}')





Before add new entry: 
        name  score  attempts qualify
a  Anastasia   12.5         1     yes
b       Dima    9.0         3      no
c  Katherine   16.5         2     yes
d      James   11.5         3      no
e      Emily    9.0         2      no
f    Michael   20.0         3     yes
g    Matthew   14.5         1     yes
h      Laura    NaN         1      no
i      Kevin    8.0         2      no
j      Jonas   19.0         1     yes
after add new entry: 
        name  score  attempts qualify
a  Anastasia   12.5         1     yes
b       Dima    9.0         3      no
c  Katherine   16.5         2     yes
d      James   11.5         3      no
e      Emily    9.0         2      no
f    Michael   20.0         3     yes
g    Matthew   14.5         1     yes
h      Laura    NaN         1      no
i      Kevin    8.0         2      no
j      Jonas   19.0         1     yes
k       Alex   15.0         2     yes
after remove last entry: 
        name  score  attempts qualify
a  Anastasia   12

### 18. Write a Python program to sort the DataFrame first by 'name' in descending order, then by 'score' in ascending order.

In [19]:
df.sort_values(by=['name', 'score'], ascending=[False, True])


        name  score  attempts qualify
f    Michael   20.0         3     yes
g    Matthew   14.5         1     yes
h      Laura    NaN         1      no
i      Kevin    8.0         2      no
c  Katherine   16.5         2     yes
j      Jonas   19.0         1     yes
d      James   11.5         3      no
e      Emily    9.0         2      no
b       Dima    9.0         3      no
a  Anastasia   12.5         1     yes


### 19. Write a Python program to replace the 'yes' and 'no' values in the 'qualify' column with True and False.

In [20]:
# map() converts each label to a boolean without the downcasting warning that replace() can emit
df['qualify'] = df['qualify'].map({'yes': True, 'no': False})
df


        name  score  attempts  qualify
a  Anastasia   12.5         1     True
b       Dima    9.0         3    False
c  Katherine   16.5         2     True
d      James   11.5         3    False
e      Emily    9.0         2    False
f    Michael   20.0         3     True
g    Matthew   14.5         1     True
h      Laura    NaN         1    False
i      Kevin    8.0         2    False
j      Jonas   19.0         1     True


### 21. Write a Python program to insert a new column into an existing DataFrame.

In [21]:
df['age'] = np.random.randint(18, 24, len(df))

df

        name  score  attempts  qualify  age
a  Anastasia   12.5         1     True   18
b       Dima    9.0         3    False   23
c  Katherine   16.5         2     True   20
d      James   11.5         3    False   18
e      Emily    9.0         2    False   18
f    Michael   20.0         3     True   20
g    Matthew   14.5         1     True   23
h      Laura    NaN         1    False   23
i      Kevin    8.0         2    False   20
j      Jonas   19.0         1     True   22
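
Assigning with `df['age'] = ...` always appends the column at the end, and the unseeded random call gives different ages on each run. The sketch below shows one possible variant that uses `DataFrame.insert` to choose the column position and a seeded generator for repeatable values. The frame `df_small`, the seed 42 and the position 1 are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Small hypothetical frame standing in for the exam data
df_small = pd.DataFrame(
    {"name": ["Anastasia", "Dima", "Katherine"], "score": [12.5, 9.0, 16.5]},
    index=["a", "b", "c"],
)

# A seeded generator makes the random ages repeatable between runs
rng = np.random.default_rng(seed=42)
ages = rng.integers(18, 24, size=len(df_small))  # upper bound 24 is exclusive

# insert() places the new column at position 1, directly after 'name'
df_small.insert(1, "age", ages)

print(df_small)
```

Note that `insert` modifies the frame in place and raises an error if the column already exists, unless `allow_duplicates=True` is passed.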


### 22. Write a Python program to iterate over rows in a DataFrame.

In [22]:
for index, row in df.iterrows():
    print(f"Index: {index}, Name: {row['name']}, Score: {row['score']}")

Index: a, Name: Anastasia, Score: 12.5
Index: b, Name: Dima, Score: 9.0
Index: c, Name: Katherine, Score: 16.5
Index: d, Name: James, Score: 11.5
Index: e, Name: Emily, Score: 9.0
Index: f, Name: Michael, Score: 20.0
Index: g, Name: Matthew, Score: 14.5
Index: h, Name: Laura, Score: nan
Index: i, Name: Kevin, Score: 8.0
Index: j, Name: Jonas, Score: 19.0
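
`iterrows()` builds a Series for every row, which is convenient but slow on large frames. A commonly used alternative is `itertuples()`, which yields lightweight named tuples. The sketch below runs on a hypothetical two-row frame, `df_small`, and collects the formatted lines in a list named `lines`.

```python
import pandas as pd

# Small hypothetical frame with the two columns used in the loop
df_small = pd.DataFrame(
    {"name": ["Anastasia", "Dima"], "score": [12.5, 9.0]},
    index=["a", "b"],
)

lines = []
# Each row is a named tuple; the index label is exposed as row.Index
for row in df_small.itertuples():
    lines.append(f"Index: {row.Index}, Name: {row.name}, Score: {row.score}")

print("\n".join(lines))
```

Attribute access such as `row.score` only works for column names that are valid Python identifiers; other columns are renamed positionally.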


### 23. Write a Python program to get a list of the DataFrame column headers.

In [23]:
# df.columns is an Index object; tolist() converts it to a plain Python list
df.columns.tolist()

['name', 'score', 'attempts', 'qualify', 'age']