### Table of Contents:
1. Introduction
2. Installation
3. Basic Concepts
4. Creating DataFrames
5. View our Data
6. Selection and Indexing
7. Handling Missing Data
8. Data Operations
9. Groupby Operations
10. Merging and Joining DataFrames
11. Working with Dates and Times
12. Inputs and Outputs
13. Visualizations
14. Advanced Topics

#### 1. Introduction:
- pandas is an open-source Python library
- provides data manipulation and data analysis tools
- high performance
- easy to use
- built on top of NumPy

#### 2. Installation:
 - Run `pip install pandas` from a terminal, or prefix the command with `!` inside a notebook cell.

In [1]:
! pip install pandas

Defaulting to user installation because normal site-packages is not writeable


#### 3. Basic Concepts:
- **Series** - a one-dimensional labeled array capable of holding any data type.
- **DataFrame** - a two-dimensional labeled data structure whose columns can each hold a different data type.
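To make the two definitions concrete, here is a minimal sketch. The names `ages` and `people` and the city values are made up for illustration and do not come from the dataset used later.

```python
import pandas as pd

# A Series is a one-dimensional labeled array; the index holds the labels.
ages = pd.Series([28, 24, 35], index=['John', 'Anna', 'Peter'], name='Age')

# A DataFrame is a two-dimensional table; each column can have its own dtype.
# (City values are illustrative only.)
people = pd.DataFrame(
    {'Age': [28, 24, 35], 'City': ['Pokhara', 'Kathmandu', 'Lalitpur']},
    index=['John', 'Anna', 'Peter'],
)

# Selecting a single column of a DataFrame gives back a Series.
age_column = people['Age']

print(ages['Anna'])        # look up a Series value by its label
print(type(age_column))    # pandas Series
```

In practice, a DataFrame can be thought of as a collection of Series that share one index.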

#### 4. Creating DataFrames

In [2]:
# Import pandas so that you can use pandas in your file
import pandas as pd

In [3]:
# DataFrame from a dictionary
dict_1 = {
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Ages': [28, 24, 35, 31]
}
dict_1

{'Name': ['John', 'Anna', 'Peter', 'Linda'], 'Ages': [28, 24, 35, 31]}

In [4]:
df = pd.DataFrame(dict_1)
df

Unnamed: 0,Name,Ages
0,John,28
1,Anna,24
2,Peter,35
3,Linda,31


In [5]:
# Create DataFrames from a list of dictionaries
data = [{'Name': 'John', 'Age': 27}, {'Name': 'Riwaj', 'Age': 28}, {'Name': 'Anna', 'Age':30}]
data

[{'Name': 'John', 'Age': 27},
 {'Name': 'Riwaj', 'Age': 28},
 {'Name': 'Anna', 'Age': 30}]

In [6]:
df2 = pd.DataFrame(data)
df2

Unnamed: 0,Name,Age
0,John,27
1,Riwaj,28
2,Anna,30


In [7]:
# Creating a DataFrame from a file: read_csv reads the CSV and returns a DataFrame
data2 = pd.read_csv('data.csv')
data2


Unnamed: 0,Series_reference,Period,Data_value,Suppressed,STATUS,UNITS,Magnitude,Subject,Group,Series_title_1,Series_title_2,Series_title_3,Series_title_4,Series_title_5
0,BDCQ.SEA1AA,2011.06,80078.0,,F,Number,0,Business Data Collection - BDC,Industry by employment variable,Filled jobs,"Agriculture, Forestry and Fishing",Actual,,
1,BDCQ.SEA1AA,2011.09,78324.0,,F,Number,0,Business Data Collection - BDC,Industry by employment variable,Filled jobs,"Agriculture, Forestry and Fishing",Actual,,
2,BDCQ.SEA1AA,2011.12,85850.0,,F,Number,0,Business Data Collection - BDC,Industry by employment variable,Filled jobs,"Agriculture, Forestry and Fishing",Actual,,
3,BDCQ.SEA1AA,2012.03,90743.0,,F,Number,0,Business Data Collection - BDC,Industry by employment variable,Filled jobs,"Agriculture, Forestry and Fishing",Actual,,
4,BDCQ.SEA1AA,2012.06,81780.0,,F,Number,0,Business Data Collection - BDC,Industry by employment variable,Filled jobs,"Agriculture, Forestry and Fishing",Actual,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
22958,BDCQ.SEE3999A,2017.06,,Y,C,Number,0,Business Data Collection - BDC,Territorial authority by employment variable,Filled jobs (workplace location based),Area Outside Territorial Authority,Actual,,
22959,BDCQ.SEE3999A,2017.09,,Y,C,Number,0,Business Data Collection - BDC,Territorial authority by employment variable,Filled jobs (workplace location based),Area Outside Territorial Authority,Actual,,
22960,BDCQ.SEE3999A,2017.12,,Y,C,Number,0,Business Data Collection - BDC,Territorial authority by employment variable,Filled jobs (workplace location based),Area Outside Territorial Authority,Actual,,
22961,BDCQ.SEE3999A,2018.03,,Y,C,Number,0,Business Data Collection - BDC,Territorial authority by employment variable,Filled jobs (workplace location based),Area Outside Territorial Authority,Actual,,


In [8]:
# read_csv already returned a DataFrame, so wrapping it again changes nothing
data2 = pd.DataFrame(data2)
data2

Unnamed: 0,Series_reference,Period,Data_value,Suppressed,STATUS,UNITS,Magnitude,Subject,Group,Series_title_1,Series_title_2,Series_title_3,Series_title_4,Series_title_5
0,BDCQ.SEA1AA,2011.06,80078.0,,F,Number,0,Business Data Collection - BDC,Industry by employment variable,Filled jobs,"Agriculture, Forestry and Fishing",Actual,,
1,BDCQ.SEA1AA,2011.09,78324.0,,F,Number,0,Business Data Collection - BDC,Industry by employment variable,Filled jobs,"Agriculture, Forestry and Fishing",Actual,,
2,BDCQ.SEA1AA,2011.12,85850.0,,F,Number,0,Business Data Collection - BDC,Industry by employment variable,Filled jobs,"Agriculture, Forestry and Fishing",Actual,,
3,BDCQ.SEA1AA,2012.03,90743.0,,F,Number,0,Business Data Collection - BDC,Industry by employment variable,Filled jobs,"Agriculture, Forestry and Fishing",Actual,,
4,BDCQ.SEA1AA,2012.06,81780.0,,F,Number,0,Business Data Collection - BDC,Industry by employment variable,Filled jobs,"Agriculture, Forestry and Fishing",Actual,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
22958,BDCQ.SEE3999A,2017.06,,Y,C,Number,0,Business Data Collection - BDC,Territorial authority by employment variable,Filled jobs (workplace location based),Area Outside Territorial Authority,Actual,,
22959,BDCQ.SEE3999A,2017.09,,Y,C,Number,0,Business Data Collection - BDC,Territorial authority by employment variable,Filled jobs (workplace location based),Area Outside Territorial Authority,Actual,,
22960,BDCQ.SEE3999A,2017.12,,Y,C,Number,0,Business Data Collection - BDC,Territorial authority by employment variable,Filled jobs (workplace location based),Area Outside Territorial Authority,Actual,,
22961,BDCQ.SEE3999A,2018.03,,Y,C,Number,0,Business Data Collection - BDC,Territorial authority by employment variable,Filled jobs (workplace location based),Area Outside Territorial Authority,Actual,,


#### 5. View our Data


In [9]:
# Without parentheses, head is not called: this prints the bound method
# (together with the whole frame's repr), not the first few rows
print(data2.head)

<bound method NDFrame.head of       Series_reference   Period  Data_value Suppressed STATUS   UNITS  \
0          BDCQ.SEA1AA  2011.06     80078.0        NaN      F  Number   
1          BDCQ.SEA1AA  2011.09     78324.0        NaN      F  Number   
2          BDCQ.SEA1AA  2011.12     85850.0        NaN      F  Number   
3          BDCQ.SEA1AA  2012.03     90743.0        NaN      F  Number   
4          BDCQ.SEA1AA  2012.06     81780.0        NaN      F  Number   
...                ...      ...         ...        ...    ...     ...   
22958    BDCQ.SEE3999A  2017.06         NaN          Y      C  Number   
22959    BDCQ.SEE3999A  2017.09         NaN          Y      C  Number   
22960    BDCQ.SEE3999A  2017.12         NaN          Y      C  Number   
22961    BDCQ.SEE3999A  2018.03         NaN          Y      C  Number   
22962    BDCQ.SEE3999A  2018.06         NaN          Y      C  Number   

       Magnitude                         Subject  \
0              0  Business Data Collectio

In [10]:
# Call head() with parentheses to display the first five rows
data2.head()

Unnamed: 0,Series_reference,Period,Data_value,Suppressed,STATUS,UNITS,Magnitude,Subject,Group,Series_title_1,Series_title_2,Series_title_3,Series_title_4,Series_title_5
0,BDCQ.SEA1AA,2011.06,80078.0,,F,Number,0,Business Data Collection - BDC,Industry by employment variable,Filled jobs,"Agriculture, Forestry and Fishing",Actual,,
1,BDCQ.SEA1AA,2011.09,78324.0,,F,Number,0,Business Data Collection - BDC,Industry by employment variable,Filled jobs,"Agriculture, Forestry and Fishing",Actual,,
2,BDCQ.SEA1AA,2011.12,85850.0,,F,Number,0,Business Data Collection - BDC,Industry by employment variable,Filled jobs,"Agriculture, Forestry and Fishing",Actual,,
3,BDCQ.SEA1AA,2012.03,90743.0,,F,Number,0,Business Data Collection - BDC,Industry by employment variable,Filled jobs,"Agriculture, Forestry and Fishing",Actual,,
4,BDCQ.SEA1AA,2012.06,81780.0,,F,Number,0,Business Data Collection - BDC,Industry by employment variable,Filled jobs,"Agriculture, Forestry and Fishing",Actual,,


In [11]:
# View the tail: the last five rows of the data
data2.tail()

Unnamed: 0,Series_reference,Period,Data_value,Suppressed,STATUS,UNITS,Magnitude,Subject,Group,Series_title_1,Series_title_2,Series_title_3,Series_title_4,Series_title_5
22958,BDCQ.SEE3999A,2017.06,,Y,C,Number,0,Business Data Collection - BDC,Territorial authority by employment variable,Filled jobs (workplace location based),Area Outside Territorial Authority,Actual,,
22959,BDCQ.SEE3999A,2017.09,,Y,C,Number,0,Business Data Collection - BDC,Territorial authority by employment variable,Filled jobs (workplace location based),Area Outside Territorial Authority,Actual,,
22960,BDCQ.SEE3999A,2017.12,,Y,C,Number,0,Business Data Collection - BDC,Territorial authority by employment variable,Filled jobs (workplace location based),Area Outside Territorial Authority,Actual,,
22961,BDCQ.SEE3999A,2018.03,,Y,C,Number,0,Business Data Collection - BDC,Territorial authority by employment variable,Filled jobs (workplace location based),Area Outside Territorial Authority,Actual,,
22962,BDCQ.SEE3999A,2018.06,,Y,C,Number,0,Business Data Collection - BDC,Territorial authority by employment variable,Filled jobs (workplace location based),Area Outside Territorial Authority,Actual,,


In [12]:
# Display basic information about the data: column names, non-null counts, and dtypes
data2.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 22963 entries, 0 to 22962
Data columns (total 14 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   Series_reference  22963 non-null  object 
 1   Period            22963 non-null  float64
 2   Data_value        20124 non-null  float64
 3   Suppressed        2839 non-null   object 
 4   STATUS            22963 non-null  object 
 5   UNITS             22963 non-null  object 
 6   Magnitude         22963 non-null  int64  
 7   Subject           22963 non-null  object 
 8   Group             22963 non-null  object 
 9   Series_title_1    22963 non-null  object 
 10  Series_title_2    22963 non-null  object 
 11  Series_title_3    22963 non-null  object 
 12  Series_title_4    0 non-null      float64
 13  Series_title_5    0 non-null      float64
dtypes: float64(4), int64(1), object(9)
memory usage: 2.5+ MB


In [13]:
# Summary statistics of the data
data2.describe()

# By default, describe() summarizes only the numeric columns, not the text (object) columns

Unnamed: 0,Period,Data_value,Magnitude,Series_title_4,Series_title_5
count,22963.0,20124.0,22963.0,0.0,0.0
mean,2017.121172,72687.83,2.310064,,
std,3.699584,210333.4,2.919651,,
min,2011.06,1.187001,0.0,,
25%,2014.06,1881.943,0.0,,
50%,2017.06,14107.5,0.0,,
75%,2020.09,60346.0,6.0,,
max,2024.03,2321295.0,6.0,,
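If the text columns should be summarized as well, `describe` accepts an `include` argument. The sketch below uses a tiny made-up frame (`sample`) rather than `data2`, so the values are illustrative only.

```python
import pandas as pd

# Small illustrative frame with one numeric and one text column.
sample = pd.DataFrame({
    'Period': [2011.06, 2011.09, 2011.12],
    'STATUS': ['F', 'F', 'C'],
})

# Default behaviour: only the numeric column appears in the summary.
numeric_summary = sample.describe()

# include='all' adds the text column, reported with count/unique/top/freq.
all_summary = sample.describe(include='all')

print(numeric_summary.columns.tolist())
print(all_summary.columns.tolist())
```

Statistics that do not apply to a column (for example `mean` for text) show up as NaN in the combined summary.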


#### 6. Selection and Indexing


In [14]:
#Selecting a column
data2['Data_value']

0        80078.0
1        78324.0
2        85850.0
3        90743.0
4        81780.0
          ...   
22958        NaN
22959        NaN
22960        NaN
22961        NaN
22962        NaN
Name: Data_value, Length: 22963, dtype: float64

In [15]:
data2['Series_title_4']

0       NaN
1       NaN
2       NaN
3       NaN
4       NaN
         ..
22958   NaN
22959   NaN
22960   NaN
22961   NaN
22962   NaN
Name: Series_title_4, Length: 22963, dtype: float64

In [16]:
#Selecting multiple columns
data2[['Period', 'Subject']]

Unnamed: 0,Period,Subject
0,2011.06,Business Data Collection - BDC
1,2011.09,Business Data Collection - BDC
2,2011.12,Business Data Collection - BDC
3,2012.03,Business Data Collection - BDC
4,2012.06,Business Data Collection - BDC
...,...,...
22958,2017.06,Business Data Collection - BDC
22959,2017.09,Business Data Collection - BDC
22960,2017.12,Business Data Collection - BDC
22961,2018.03,Business Data Collection - BDC


In [18]:
# Selecting a single row by integer position (returns a Series)
data2.iloc[1]

Series_reference                          BDCQ.SEA1AA
Period                                        2011.09
Data_value                                    78324.0
Suppressed                                        NaN
STATUS                                              F
UNITS                                          Number
Magnitude                                           0
Subject                Business Data Collection - BDC
Group                 Industry by employment variable
Series_title_1                            Filled jobs
Series_title_2      Agriculture, Forestry and Fishing
Series_title_3                                 Actual
Series_title_4                                    NaN
Series_title_5                                    NaN
Name: 1, dtype: object

In [19]:
# Selecting the first 2 rows by integer position (the stop position is excluded)
data2.iloc[0:2]

Unnamed: 0,Series_reference,Period,Data_value,Suppressed,STATUS,UNITS,Magnitude,Subject,Group,Series_title_1,Series_title_2,Series_title_3,Series_title_4,Series_title_5
0,BDCQ.SEA1AA,2011.06,80078.0,,F,Number,0,Business Data Collection - BDC,Industry by employment variable,Filled jobs,"Agriculture, Forestry and Fishing",Actual,,
1,BDCQ.SEA1AA,2011.09,78324.0,,F,Number,0,Business Data Collection - BDC,Industry by employment variable,Filled jobs,"Agriculture, Forestry and Fishing",Actual,,


In [20]:
# Every third row among the first 100 (start:stop:step)
data2.iloc[0:100:3]

Unnamed: 0,Series_reference,Period,Data_value,Suppressed,STATUS,UNITS,Magnitude,Subject,Group,Series_title_1,Series_title_2,Series_title_3,Series_title_4,Series_title_5
0,BDCQ.SEA1AA,2011.06,80078.0,,F,Number,0,Business Data Collection - BDC,Industry by employment variable,Filled jobs,"Agriculture, Forestry and Fishing",Actual,,
3,BDCQ.SEA1AA,2012.03,90743.0,,F,Number,0,Business Data Collection - BDC,Industry by employment variable,Filled jobs,"Agriculture, Forestry and Fishing",Actual,,
6,BDCQ.SEA1AA,2012.12,87793.0,,F,Number,0,Business Data Collection - BDC,Industry by employment variable,Filled jobs,"Agriculture, Forestry and Fishing",Actual,,
9,BDCQ.SEA1AA,2013.09,81471.0,,F,Number,0,Business Data Collection - BDC,Industry by employment variable,Filled jobs,"Agriculture, Forestry and Fishing",Actual,,
12,BDCQ.SEA1AA,2014.06,85879.0,,F,Number,0,Business Data Collection - BDC,Industry by employment variable,Filled jobs,"Agriculture, Forestry and Fishing",Actual,,
15,BDCQ.SEA1AA,2015.03,98202.0,,F,Number,0,Business Data Collection - BDC,Industry by employment variable,Filled jobs,"Agriculture, Forestry and Fishing",Actual,,
18,BDCQ.SEA1AA,2015.12,96848.0,,F,Number,0,Business Data Collection - BDC,Industry by employment variable,Filled jobs,"Agriculture, Forestry and Fishing",Actual,,
21,BDCQ.SEA1AA,2016.09,85933.0,,F,Number,0,Business Data Collection - BDC,Industry by employment variable,Filled jobs,"Agriculture, Forestry and Fishing",Actual,,
24,BDCQ.SEA1AA,2017.06,90510.0,,F,Number,0,Business Data Collection - BDC,Industry by employment variable,Filled jobs,"Agriculture, Forestry and Fishing",Actual,,
27,BDCQ.SEA1AA,2018.03,101168.0,,F,Number,0,Business Data Collection - BDC,Industry by employment variable,Filled jobs,"Agriculture, Forestry and Fishing",Actual,,


In [21]:
# Selecting rows and columns by label; unlike iloc, a loc slice includes the end label (0, 1 and 2 here)
data2.loc[0:2, ['Period', 'Data_value']]

Unnamed: 0,Period,Data_value
0,2011.06,80078.0
1,2011.09,78324.0
2,2011.12,85850.0


In [22]:
# Chain a positional row slice with a column selection
data2.iloc[0:4][['Period', 'Subject']]

Unnamed: 0,Period,Subject
0,2011.06,Business Data Collection - BDC
1,2011.09,Business Data Collection - BDC
2,2011.12,Business Data Collection - BDC
3,2012.03,Business Data Collection - BDC


- **iloc**: selects rows and columns by integer position; slices exclude the stop position, like Python lists.
- **loc**: selects rows and columns by label (index name or column name); slices include the end label.
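The difference is easiest to see when the index is not the default 0, 1, 2, ... sequence. A minimal sketch with a hypothetical `scores` frame and string labels:

```python
import pandas as pd

# Illustrative frame with string labels as the index.
scores = pd.DataFrame({'Score': [84, 85, 81, 75]}, index=['a', 'b', 'c', 'd'])

# iloc works on positions: rows 0 and 1, the stop position 2 is excluded.
by_position = scores.iloc[0:2]

# loc works on labels: rows 'a' through 'c', the end label is included.
by_label = scores.loc['a':'c']

print(len(by_position), len(by_label))
```

With the default integer index the labels and positions happen to coincide, which is why `data2.loc[0:2]` returns three rows while `data2.iloc[0:2]` returns two.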

#### 7. Handling Missing Data

In [23]:
# Identifying the missing data
data2.isnull().sum()

Series_reference        0
Period                  0
Data_value           2839
Suppressed          20124
STATUS                  0
UNITS                   0
Magnitude               0
Subject                 0
Group                   0
Series_title_1          0
Series_title_2          0
Series_title_3          0
Series_title_4      22963
Series_title_5      22963
dtype: int64

In [24]:
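# Drop every row that contains at least one missing value.
# Series_title_4 and Series_title_5 are entirely NaN, so this removes ALL rows.
data2.dropna(inplace=True)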

In [25]:
data2.isnull().sum()

Series_reference    0
Period              0
Data_value          0
Suppressed          0
STATUS              0
UNITS               0
Magnitude           0
Subject             0
Group               0
Series_title_1      0
Series_title_2      0
Series_title_3      0
Series_title_4      0
Series_title_5      0
dtype: int64

In [26]:
data2.describe()

Unnamed: 0,Period,Data_value,Magnitude,Series_title_4,Series_title_5
count,0.0,0.0,0.0,0.0,0.0
mean,,,,,
std,,,,,
min,,,,,
25%,,,,,
50%,,,,,
75%,,,,,
max,,,,,


In [27]:
data2.head()

Unnamed: 0,Series_reference,Period,Data_value,Suppressed,STATUS,UNITS,Magnitude,Subject,Group,Series_title_1,Series_title_2,Series_title_3,Series_title_4,Series_title_5


In [28]:
data2.tail()

Unnamed: 0,Series_reference,Period,Data_value,Suppressed,STATUS,UNITS,Magnitude,Subject,Group,Series_title_1,Series_title_2,Series_title_3,Series_title_4,Series_title_5


In [29]:
# Without parentheses this shows the bound method; its repr confirms the DataFrame is now empty
data2.info

<bound method DataFrame.info of Empty DataFrame
Columns: [Series_reference, Period, Data_value, Suppressed, STATUS, UNITS, Magnitude, Subject, Group, Series_title_1, Series_title_2, Series_title_3, Series_title_4, Series_title_5]
Index: []>

In [30]:
data_test = {
    'Name': ['Riwaj Bhurtel', 'Aayushma Gubhaju', 'Nischal Malla', 'Muna Sunar'],
    'Age': [24, 23, 21, 21],
}
data_test = pd.DataFrame(data_test)
data_test

Unnamed: 0,Name,Age
0,Riwaj Bhurtel,24
1,Aayushma Gubhaju,23
2,Nischal Malla,21
3,Muna Sunar,21


In [31]:
data_test2 = [{'Name': 'Anna', 'Age': 24}, {'Name': 'Riwaj', 'Age': 23}, {'Name': 'Nischal', 'Address': 'Pokhara'}]
data_test2 = pd.DataFrame(data_test2)
data_test2

Unnamed: 0,Name,Age,Address
0,Anna,24.0,
1,Riwaj,23.0,
2,Nischal,,Pokhara


In [32]:
data_test2.isnull().sum()

Name       0
Age        1
Address    2
dtype: int64

In [33]:
# Fill missing Age values with 0; other columns are left untouched
data_test2.fillna(value = {'Age': 0}, inplace=True)

In [44]:
data_test2

Unnamed: 0,Name,Age,Address
0,Anna,24.0,
1,Riwaj,23.0,
2,Nischal,0.0,Pokhara


In [34]:
#Adding a new column
data_test2['Salary'] = [5000, 3000, 15000]
data_test2

Unnamed: 0,Name,Age,Address,Salary
0,Anna,24.0,,5000
1,Riwaj,23.0,,3000
2,Nischal,0.0,Pokhara,15000


In [35]:
# Fill the missing Address values with a default
data_test2.fillna(value={'Address': 'Pokhara'}, inplace=True)
data_test2

Unnamed: 0,Name,Age,Address,Salary
0,Anna,24.0,Pokhara,5000
1,Riwaj,23.0,Pokhara,3000
2,Nischal,0.0,Pokhara,15000


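- **dropna**: by default, removes every row that contains at least one NaN value. This is why `data2` became empty: two of its columns are entirely NaN.
- **fillna**: replaces NaN values with a value you supply, optionally per column through a dictionary.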
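To avoid wiping out a whole dataset, `dropna` can be narrowed with its `axis`, `how`, and `subset` parameters. The following is a sketch on a small made-up frame (`raw`), not on `data2` itself.

```python
import numpy as np
import pandas as pd

# Illustrative frame: one partly missing column and one entirely empty column.
raw = pd.DataFrame({
    'value': [1.0, np.nan, 3.0],
    'note': ['x', 'y', 'z'],
    'empty': [np.nan, np.nan, np.nan],
})

# Default dropna: every row has a NaN in 'empty', so nothing survives.
all_dropped = raw.dropna()

# Drop only the columns that are entirely NaN.
no_empty_cols = raw.dropna(axis=1, how='all')

# Then drop only the rows where 'value' is missing.
cleaned = no_empty_cols.dropna(subset=['value'])

print(len(all_dropped), list(no_empty_cols.columns), len(cleaned))
```

Without `inplace=True`, each call returns a new DataFrame and leaves the original untouched, which makes it easier to inspect the result before committing to it.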


#### 8. Data Operations

In [36]:
#adding a new column
data_test2['Phone Number'] = [9869100000, 9869146101, 9846000000]
data_test2

Unnamed: 0,Name,Age,Address,Salary,Phone Number
0,Anna,24.0,Pokhara,5000,9869100000
1,Riwaj,23.0,Pokhara,3000,9869146101
2,Nischal,0.0,Pokhara,15000,9846000000


In [37]:
#Renaming a column
data_test2.rename(columns={'Phone Number': 'Contact'}, inplace=True)
data_test2

Unnamed: 0,Name,Age,Address,Salary,Contact
0,Anna,24.0,Pokhara,5000,9869100000
1,Riwaj,23.0,Pokhara,3000,9869146101
2,Nischal,0.0,Pokhara,15000,9846000000


In [38]:
# How to apply functions
# Add 1 to every value in the Contact column
data_test2["Contact"] = data_test2['Contact'].apply(lambda x: x + 1)
data_test2


Unnamed: 0,Name,Age,Address,Salary,Contact
0,Anna,24.0,Pokhara,5000,9869100001
1,Riwaj,23.0,Pokhara,3000,9869146102
2,Nischal,0.0,Pokhara,15000,9846000001
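For simple arithmetic like this, the same result can usually be obtained without `apply` by operating on the whole column at once. A small sketch with a hypothetical `staff` frame:

```python
import pandas as pd

# Illustrative frame; names and salaries mirror the style of the examples above.
staff = pd.DataFrame({'Name': ['Anna', 'Riwaj'], 'Salary': [5000, 3000]})

# Element by element: the lambda is called once per value.
raised_apply = staff['Salary'].apply(lambda x: x + 500)

# Vectorized: the addition is applied to the whole column in one step.
raised_vectorized = staff['Salary'] + 500

print(raised_apply.equals(raised_vectorized))
```

`apply` remains useful when the per-value logic cannot be written as a column-wide expression.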


#### 9. Groupby Operations

In [39]:
data = {"Name": ['John', 'Anna', 'John', "Anna", "Peter"],
        "Age": [24, 25, 24, 35, 23],
        "Score": [84, 85, 81, 75, 69]
        }
df = pd.DataFrame(data)
df

Unnamed: 0,Name,Age,Score
0,John,24,84
1,Anna,25,85
2,John,24,81
3,Anna,35,75
4,Peter,23,69


In [40]:
# Group by Name
grouped = df.groupby('Name')
grouped

<pandas.core.groupby.generic.DataFrameGroupBy object at 0x000001EF671369C0>

In [41]:
# Calculate the mean Score for each Name
mean_score = grouped['Score'].mean()
mean_score

Name
Anna     80.0
John     82.5
Peter    69.0
Name: Score, dtype: float64
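Several statistics can be computed per group in one call with `agg`. The sketch below rebuilds the same Name/Score data under a separate name (`df_scores`) so it runs on its own.

```python
import pandas as pd

# Same Name/Score values as the groupby example above.
df_scores = pd.DataFrame({
    'Name': ['John', 'Anna', 'John', 'Anna', 'Peter'],
    'Score': [84, 85, 81, 75, 69],
})

# One row per Name, one column per aggregation.
summary = df_scores.groupby('Name')['Score'].agg(['mean', 'max', 'count'])
print(summary)
```

The result is a DataFrame indexed by the group key, so a single value can be looked up with `summary.loc['Anna', 'mean']`.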

#### 10. Merging and Joining DataFrames

In [42]:
df2 = pd.DataFrame({'A': ['A0', 'A1', 'A2'],
                    'B': ['B1', 'B2', 'B3']},
                    index = ['K0', 'K1', 'K2'])
df2

Unnamed: 0,A,B
K0,A0,B1
K1,A1,B2
K2,A2,B3


In [43]:
df3 = pd.DataFrame({'C': ['C0', 'C1', 'C2'],
                    'D': ['D0', 'D1', 'D2']},
                    index = ['K0', 'K1', 'K2'])
df3

Unnamed: 0,C,D
K0,C0,D0
K1,C1,D1
K2,C2,D2


In [44]:
# join aligns the two DataFrames on their index (K0, K1, K2)
joined_df = df2.join(df3)
joined_df

Unnamed: 0,A,B,C,D
K0,A0,B1,C0,D0
K1,A1,B2,C1,D1
K2,A2,B3,C2,D2
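`join` matches rows on the index. When the matching key lives in an ordinary column, `pd.merge` is the usual tool. A minimal sketch with hypothetical frames `left` and `right`, where one key is deliberately missing on each side:

```python
import pandas as pd

# Illustrative frames sharing a 'key' column; K2 and K3 have no partner.
left = pd.DataFrame({'key': ['K0', 'K1', 'K2'], 'A': ['A0', 'A1', 'A2']})
right = pd.DataFrame({'key': ['K0', 'K1', 'K3'], 'C': ['C0', 'C1', 'C3']})

# Inner merge: keep only keys present in both frames.
inner = pd.merge(left, right, on='key', how='inner')

# Left merge: keep every row of `left`; unmatched rows get NaN in 'C'.
left_join = pd.merge(left, right, on='key', how='left')

print(inner)
print(left_join)
```

The `how` argument also accepts `'right'` and `'outer'`, which control which side's unmatched keys are kept.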


#### 11. Working with Dates and Times

In [45]:
df

Unnamed: 0,Name,Age,Score
0,John,24,84
1,Anna,25,85
2,John,24,81
3,Anna,35,75
4,Peter,23,69


In [48]:
# Add one consecutive daily date per row, starting on 1 January 2024
df['Date'] = pd.date_range(start= '1/1/2024', periods= len(df), freq= 'D')
df

Unnamed: 0,Name,Age,Score,Date
0,John,24,84,2024-01-01
1,Anna,25,85,2024-01-02
2,John,24,81,2024-01-03
3,Anna,35,75,2024-01-04
4,Peter,23,69,2024-01-05


In [49]:
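# Converting to datetime. The column is already datetime64 here (it came from date_range),
# so this changes nothing; to_datetime matters when dates arrive as strings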
df['Date'] = pd.to_datetime(df['Date'])
df

Unnamed: 0,Name,Age,Score,Date
0,John,24,84,2024-01-01
1,Anna,25,85,2024-01-02
2,John,24,81,2024-01-03
3,Anna,35,75,2024-01-04
4,Peter,23,69,2024-01-05
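`to_datetime` is more useful when the dates start out as text, as they typically do after reading a CSV. A sketch with a made-up `events` frame that includes one invalid entry:

```python
import pandas as pd

# Illustrative frame: dates stored as strings, one of them unparseable.
events = pd.DataFrame({'Date': ['2024-01-01', '2024-01-02', 'not a date']})

# format states the expected layout; errors='coerce' turns bad values into NaT
# instead of raising an exception.
events['Date'] = pd.to_datetime(events['Date'], format='%Y-%m-%d', errors='coerce')

print(events['Date'].dtype)
print(events['Date'].isna().sum())
```

Once the column has a datetime dtype, the `.dt` accessor becomes available for extracting the year, month, or day.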


In [50]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype         
---  ------  --------------  -----         
 0   Name    5 non-null      object        
 1   Age     5 non-null      int64         
 2   Score   5 non-null      int64         
 3   Date    5 non-null      datetime64[ns]
dtypes: datetime64[ns](1), int64(2), object(1)
memory usage: 292.0+ bytes


In [51]:
# Extracting components of dates
df["Year"] = df['Date'].dt.year
df 

Unnamed: 0,Name,Age,Score,Date,Year
0,John,24,84,2024-01-01,2024
1,Anna,25,85,2024-01-02,2024
2,John,24,81,2024-01-03,2024
3,Anna,35,75,2024-01-04,2024
4,Peter,23,69,2024-01-05,2024


In [53]:
df['Month'] = df['Date'].dt.month
df['Day'] = df['Date'].dt.day
df

Unnamed: 0,Name,Age,Score,Date,Year,Month,Day
0,John,24,84,2024-01-01,2024,1,1
1,Anna,25,85,2024-01-02,2024,1,2
2,John,24,81,2024-01-03,2024,1,3
3,Anna,35,75,2024-01-04,2024,1,4
4,Peter,23,69,2024-01-05,2024,1,5


#### 12. Inputs and Outputs

In [54]:
#Writing to a CSV file
df.to_csv('my_file.csv', index=False)
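The effect of `index=False` can be checked by writing a frame and reading it back. The sketch below uses an in-memory buffer instead of a file on disk, and a small made-up frame (`df_out`), so it leaves no files behind.

```python
import io

import pandas as pd

# Illustrative frame to write out.
df_out = pd.DataFrame({'Name': ['John', 'Anna'], 'Score': [84, 85]})

# Write without the index, then read the CSV text back.
buffer = io.StringIO()
df_out.to_csv(buffer, index=False)
buffer.seek(0)
df_back = pd.read_csv(buffer)

# Write WITH the index (the default): it comes back as an extra unnamed column.
buffer_with_index = io.StringIO()
df_out.to_csv(buffer_with_index)
buffer_with_index.seek(0)
df_with_index = pd.read_csv(buffer_with_index)

print(df_back.columns.tolist())
print(df_with_index.columns.tolist())
```

When the index is written, `read_csv` labels that header-less first column `Unnamed: 0`; passing `index_col=0` to `read_csv` restores it as the index instead.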