## SERIES

### 1. Performing Series Comparisons in Pandas

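To compare two Series element by element in pandas, apply the comparison operators (`==`, `>`, `<`, `!=`, etc.) directly to the Series objects. Each comparison returns a boolean Series of the same length. Both Series must carry identical index labels; otherwise pandas raises a `ValueError`.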

Here's an example code snippet that demonstrates different types of comparisons between two sample Series:

In [1]:
import pandas as pd

# Define the sample series
series1 = pd.Series([4, 65, 436, 3, 9])
series2 = pd.Series([7, 0, 3, 897, 9])

# Perform different comparisons
equal_comparison = series1 == series2
greater_comparison = series1 > series2
less_comparison = series1 < series2
not_equal_comparison = series1 != series2

# Print the comparison results
print("Equal Comparison:")
print(equal_comparison)
print("\nGreater Comparison:")
print(greater_comparison)
print("\nLess Comparison:")
print(less_comparison)
print("\nNot Equal Comparison:")
print(not_equal_comparison)


Equal Comparison:
0    False
1    False
2    False
3    False
4     True
dtype: bool

Greater Comparison:
0    False
1     True
2     True
3    False
4    False
dtype: bool

Less Comparison:
0     True
1    False
2    False
3     True
4    False
dtype: bool

Not Equal Comparison:
0     True
1     True
2     True
3     True
4    False
dtype: bool
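
The boolean Series produced by a comparison is typically used as a mask or aggregated. The sketch below reuses the two sample Series from the example above; the variable names `mask`, `larger_values`, `num_equal`, and `any_equal` are illustrative only.

```python
import pandas as pd

series1 = pd.Series([4, 65, 436, 3, 9])
series2 = pd.Series([7, 0, 3, 897, 9])

# A comparison returns a boolean Series that can be used as a mask
mask = series1 > series2
larger_values = series1[mask]
print(larger_values)

# Boolean Series can also be aggregated: count the matches, or test for any match
num_equal = int((series1 == series2).sum())
any_equal = bool((series1 == series2).any())
print(num_equal, any_equal)
```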


### 2. Performing Arithmetic Operations on Series in Pandas

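You can perform arithmetic operations between two Series in pandas by using mathematical operators (`+`, `-`, `*`, `/`) on the Series objects. Values are matched by index label, and division with `/` returns floating-point results even when both inputs are integers.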

Here's an example code snippet that demonstrates different arithmetic operations between two sample Series:


In [2]:
import pandas as pd

# Define the sample series
series1 = pd.Series([2, 4, 6, 8, 14])
series2 = pd.Series([1, 3, 5, 7, 9])

# Addition
addition = series1 + series2

# Subtraction
subtraction = series1 - series2

# Multiplication
multiplication = series1 * series2

# Division
division = series1 / series2

# Print the results
print("Addition:")
print(addition)
print("\nSubtraction:")
print(subtraction)
print("\nMultiplication:")
print(multiplication)
print("\nDivision:")
print(division)


Addition:
0     3
1     7
2    11
3    15
4    23
dtype: int64

Subtraction:
0    1
1    1
2    1
3    1
4    5
dtype: int64

Multiplication:
0      2
1     12
2     30
3     56
4    126
dtype: int64

Division:
0    2.000000
1    1.333333
2    1.200000
3    1.142857
4    1.555556
dtype: float64
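
When the two Series do not share all index labels, the operators still align on labels and leave `NaN` where one side has no value. The following is a minimal sketch with made-up labels (`'x'`, `'y'`, `'z'`) showing that behavior and the `add()` method with `fill_value` as one way around it.

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=['x', 'y', 'z'])
b = pd.Series([10, 20], index=['y', 'z'])

# The + operator aligns on labels; 'x' has no partner in b, so the result is NaN there
aligned_sum = a + b
print(aligned_sum)

# add() with fill_value treats the missing partner as 0 instead
filled_sum = a.add(b, fill_value=0)
print(filled_sum)
```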


### 3. Converting a Dictionary to a Series in Pandas

In pandas, you can convert a dictionary to a Series by passing it to the `pd.Series()` constructor. The keys of the dictionary become the index labels of the resulting Series, and the values become the corresponding values in the Series.

Here's an example code snippet that demonstrates how to convert a sample dictionary to a Series:


In [4]:

# Define the sample dictionary
dictionary1 = {'Josh': 24, 'Sam': 36, 'Peace': 19, 'Charles': 65, 'Tom': 44}

# Convert the dictionary to a Series
series = pd.Series(dictionary1)

# Print the resulting Series
print(series)


Josh       24
Sam        36
Peace      19
Charles    65
Tom        44
dtype: int64
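
The constructor also accepts an `index` argument, which selects and orders the dictionary keys. The sketch below uses a shortened version of the sample dictionary; the label `'Ada'` is a hypothetical key that is deliberately absent from it.

```python
import pandas as pd

ages = {'Josh': 24, 'Sam': 36, 'Peace': 19}

# Passing index picks keys in the given order; labels missing from the dict become NaN
subset = pd.Series(ages, index=['Peace', 'Josh', 'Ada'])
print(subset)
```

Because `NaN` is a floating-point value, the resulting Series is stored as float rather than int.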


### 4. Converting a Pandas Series to a NumPy Array

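In pandas, you can convert a Series to a NumPy array using the `to_numpy()` method. This method returns an array containing the values from the Series; pass `copy=True` if you need an array that is guaranteed not to share memory with the Series. A Series with mixed value types, like the one below, produces an array of dtype `object`.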

Here's an example code snippet that demonstrates how to convert a sample Series to a NumPy array:


In [5]:
import pandas as pd
import numpy as np

# Define the sample series
series = pd.Series(['Love', 800, 'Joy', 789.9, 'Peace', True])

# Convert the series to an array
array = series.to_numpy()

# Print the resulting array
print(array)


['Love' 800 'Joy' 789.9 'Peace' True]
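
The `dtype` and `copy` parameters of `to_numpy()` control the element type and whether the result is independent of the Series. A small sketch with illustrative sample data:

```python
import pandas as pd

numbers = pd.Series([1, 2, 3])
mixed = pd.Series(['Love', 800, True])

int_array = numbers.to_numpy()               # integer dtype, taken from the Series
float_array = numbers.to_numpy(dtype=float)  # converted to float64
object_array = mixed.to_numpy()              # mixed types fall back to object

# copy=True guarantees an independent array; changing it leaves the Series alone
independent = numbers.to_numpy(copy=True)
independent[0] = 99
print(numbers.tolist(), independent.tolist())
```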


### 5. Replacing Values in a Pandas Series

In pandas, you can replace values in a Series by assigning through a boolean mask. The example below reads the 'HomeTeam' column, finds its most frequent value, and replaces every other value with 'Other':


In [7]:
import pandas as pd

# Read the dataset into a DataFrame
df = pd.read_csv('AfricaCupofNationsMatches.csv')

# Extract the 'HomeTeam' column as an independent copy, so that the
# assignment below does not trigger a SettingWithCopyWarning
home_teams = df['HomeTeam'].copy()

# Find the most frequent value
most_frequent = home_teams.mode()[0]

# Replace all other values with 'Other'
home_teams[home_teams != most_frequent] = 'Other'

# Print the modified series
print(home_teams)


0      Other
1      Other
2      Other
3      Other
4      Other
       ...  
617    Other
618    Other
619    Other
620    Other
621    Other
Name: HomeTeam, Length: 622, dtype: object
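
An alternative that does not modify the original Series at all is `Series.where()`, which keeps values where a condition holds and substitutes a replacement elsewhere. The sketch below uses a small in-memory Series instead of the CSV file; the team names are sample data.

```python
import pandas as pd

teams = pd.Series(['Egypt', 'Sudan', 'Egypt', 'Ghana'])
most_frequent = teams.mode()[0]

# where() keeps values where the condition is True and uses 'Other' elsewhere
relabelled = teams.where(teams == most_frequent, 'Other')
print(relabelled)
```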


## DATA FRAMES

### 1. Reading and Displaying a DataFrame

To read a CSV file into a DataFrame and display its contents, you can use the `pd.read_csv()` function from the pandas library.

Here's an example code snippet that reads a CSV file named 'AfricaCupofNationsMatches.csv' into a DataFrame and then displays its contents:

In [8]:

# Read the CSV file into a DataFrame
df3 = pd.read_csv('AfricaCupofNationsMatches.csv')

# Display the DataFrame
print(df3)


     Year                      Date   Time      HomeTeam       AwayTeam  \
0    1957                  10-Feb-57    NaN       Sudan           Egypt   
1    1957                  10-Feb-57    NaN    Ethiopia    South Africa   
2    1957                  16-Feb-57    NaN       Egypt        Ethiopia   
3    1959                  22-May-59    NaN        Egypt       Ethiopia   
4    1959                  25-May-59    NaN       Sudan        Ethiopia   
..    ...                        ...    ...          ...            ...   
617  2019  11 July 2019 (2019-07-11)  21:00  Madagascar         Tunisia   
618  2019  14 July 2019 (2019-07-14)  18:00     Senegal         Tunisia   
619  2019  14 July 2019 (2019-07-14)  21:00     Algeria         Nigeria   
620  2019  17 July 2019 (2019-07-17)  21:00     Tunisia         Nigeria   
621  2019  19 July 2019 (2019-07-19)  21:00     Senegal         Algeria   

     HomeTeamGoals  AwayTeamGoals                 Stage  \
0              1.0            2.0       

### 2. Getting the First N Rows of a DataFrame

To get the first N rows of a DataFrame, you can use the `.head(N)` method. It returns a new DataFrame containing only the first N rows.

Here's an example code snippet that retrieves the first 7 rows from a DataFrame:

In [9]:

# Read the CSV file into a DataFrame
df3 = pd.read_csv('AfricaCupofNationsMatches.csv')

# Get the first 7 rows of the DataFrame
first_7_rows = df3.head(7)

# Display the first 7 rows
print(first_7_rows)


   Year      Date  Time    HomeTeam       AwayTeam  HomeTeamGoals  \
0  1957  10-Feb-57   NaN     Sudan           Egypt            1.0   
1  1957  10-Feb-57   NaN  Ethiopia    South Africa            NaN   
2  1957  16-Feb-57   NaN     Egypt        Ethiopia            4.0   
3  1959  22-May-59   NaN      Egypt       Ethiopia            4.0   
4  1959  25-May-59   NaN     Sudan        Ethiopia            1.0   
5  1959  29-May-59   NaN      Egypt          Sudan            2.0   
6  1962  14-Jan-62   NaN  Ethiopia         Tunisia            4.0   

   AwayTeamGoals             Stage  \
0            2.0        Semifinals   
1            NaN        Semifinals   
2            0.0             Final   
3            0.0  Final Tournament   
4            0.0  Final Tournament   
5            1.0  Final Tournament   
6            2.0        Semifinals   

                                SpecialWinConditions                 Stadium  \
0                                                NaN       Mun

### 3. Selecting Specific Columns from a DataFrame

To select specific columns from a DataFrame, you can pass a list of column names to the indexing operator. The result is a new DataFrame containing only those columns, in the order listed.

In [10]:
# Read the CSV file into a DataFrame
df3 = pd.read_csv('AfricaCupofNationsMatches.csv')

# Select the desired columns
selected_columns = df3[['HomeTeam', 'AwayTeam', 'HomeTeamGoals', 'AwayTeamGoals']]

# Display the selected columns
print(selected_columns)

        HomeTeam       AwayTeam  HomeTeamGoals  AwayTeamGoals
0         Sudan           Egypt            1.0            2.0
1      Ethiopia    South Africa            NaN            NaN
2         Egypt        Ethiopia            4.0            0.0
3          Egypt       Ethiopia            4.0            0.0
4         Sudan        Ethiopia            1.0            0.0
..           ...            ...            ...            ...
617  Madagascar         Tunisia            0.0            3.0
618     Senegal         Tunisia            1.0            0.0
619     Algeria         Nigeria            2.0            1.0
620     Tunisia         Nigeria            0.0            1.0
621     Senegal         Algeria            0.0            1.0

[622 rows x 4 columns]
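
The number of brackets matters: a single column name returns a Series, while a list (even with one name) returns a DataFrame. A minimal sketch on a small in-memory DataFrame standing in for the CSV data:

```python
import pandas as pd

matches = pd.DataFrame({
    'HomeTeam': ['Sudan', 'Egypt'],
    'AwayTeam': ['Egypt', 'Ethiopia'],
    'HomeTeamGoals': [1.0, 4.0],
})

as_series = matches['HomeTeam']    # single label -> Series
as_frame = matches[['HomeTeam']]   # list of labels -> DataFrame
print(type(as_series).__name__, type(as_frame).__name__)
```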


### 4. Selecting Rows Where a Team Appears in a DataFrame

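To select the rows where a specific team appears in a DataFrame, you can use conditional filtering. Two conditions are combined with the `|` (or) operator, and each condition must be wrapped in parentheses.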

In [11]:

# Read the CSV file into a DataFrame
df3 = pd.read_csv('AfricaCupofNationsMatches.csv')

# Select the rows where Egypt appears
egypt_matches = df3[(df3['HomeTeam'] == 'Egypt') | (df3['AwayTeam'] == 'Egypt')]

# Display the selected rows
print(egypt_matches)


    Year      Date  Time    HomeTeam   AwayTeam  HomeTeamGoals  AwayTeamGoals  \
3   1959  22-May-59   NaN      Egypt   Ethiopia            4.0            0.0   
5   1959  29-May-59   NaN      Egypt      Sudan            2.0            1.0   
7   1962  18-Jan-62   NaN      Egypt     Uganda            2.0            1.0   
9   1962  21-Jan-62   NaN  Ethiopia       Egypt            4.0            2.0   
13  1963  24-Nov-63   NaN      Egypt    Nigeria            6.0            3.0   
14  1963  26-Nov-63   NaN      Egypt      Sudan            2.0            2.0   
16  1963  30-Nov-63   NaN      Egypt   Ethiopia            3.0            0.0   

                Stage    SpecialWinConditions                 Stadium  \
3    Final Tournament                     NaN   Prince Farouk Stadium   
5    Final Tournament                     NaN   Prince Farouk Stadium   
7          Semifinals                     NaN  Hailé Sélassié Stadium   
9               Final   win after extra time   Hailé Sélass

### 5. Counting Rows and Columns in a DataFrame

To count the number of rows and columns in a DataFrame, you can use the `.shape` attribute.

In [12]:

# Read the CSV file into a DataFrame
df3 = pd.read_csv('AfricaCupofNationsMatches.csv')

# Count the number of rows and columns
num_rows = df3.shape[0]
num_cols = df3.shape[1]

# Display the number of rows and columns
print("Number of rows:", num_rows)
print("Number of columns:", num_cols)


Number of rows: 622
Number of columns: 12


### 6. Selecting Rows with Missing Values in a DataFrame

To select specific rows in a DataFrame where a particular column has missing values, you can use the `.isnull()` method.

In [13]:
# Read the CSV file into a DataFrame
df3 = pd.read_csv('AfricaCupofNationsMatches.csv')

# Select the rows where 'Attendance' is missing
missing_attendance = df3[df3['Attendance'].isnull()]

# Display the selected rows
print(missing_attendance)


     Year                         Date   Time        HomeTeam       AwayTeam  \
1    1957                     10-Feb-57    NaN      Ethiopia    South Africa   
8    1962                     20-Jan-62    NaN       Tunisia          Uganda   
9    1962                     21-Jan-62    NaN      Ethiopia           Egypt   
10   1963                     24-Nov-63    NaN         Ghana         Tunisia   
11   1963                     26-Nov-63    NaN         Ghana        Ethiopia   
..    ...                           ...    ...            ...            ...   
565  2017  29 January 2017 (2017-01-29)  20:00         Egypt         Morocco   
566  2017  1 February 2017 (2017-02-01)  20:00  Burkina Faso           Egypt   
567  2017  2 February 2017 (2017-02-02)  20:00      Cameroon           Ghana   
568  2017  4 February 2017 (2017-02-04)  20:00  Burkina Faso           Ghana   
569  2017  5 February 2017 (2017-02-05)  20:00         Egypt        Cameroon   

     HomeTeamGoals  AwayTeamGoals      
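
Two related operations are often useful alongside this filter: counting missing values per column and selecting the complementary rows. The sketch below assumes a small in-memory DataFrame with a made-up 'Attendance' column in place of the CSV file.

```python
import numpy as np
import pandas as pd

matches = pd.DataFrame({
    'HomeTeam': ['Sudan', 'Ethiopia', 'Egypt'],
    'Attendance': [30000.0, np.nan, 30000.0],
})

# Count the missing values in every column
missing_per_column = matches.isna().sum()
print(missing_per_column)

# Keep only the rows where 'Attendance' is present
complete_rows = matches[matches['Attendance'].notna()]
print(complete_rows)
```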

### 7. Selecting Rows Based on a Condition in a DataFrame

To select specific rows in a DataFrame based on a condition, you can pass a boolean mask inside square brackets (`[]`). Several conditions are combined with `&` (and), with each condition wrapped in parentheses.

In [14]:
import pandas as pd

# Read the CSV file into a DataFrame
df3 = pd.read_csv('AfricaCupofNationsMatches.csv')

# Select the rows where 'HomeTeamGoals' are between 3 and 6 inclusive
selected_rows = df3[(df3['HomeTeamGoals'] >= 3) & (df3['HomeTeamGoals'] <= 6)]

# Display the selected rows
print(selected_rows)


     Year                         Date   Time    HomeTeam     AwayTeam  \
2    1957                     16-Feb-57    NaN     Egypt      Ethiopia   
3    1959                     22-May-59    NaN      Egypt     Ethiopia   
6    1962                     14-Jan-62    NaN  Ethiopia       Tunisia   
8    1962                     20-Jan-62    NaN   Tunisia        Uganda   
9    1962                     21-Jan-62    NaN  Ethiopia         Egypt   
..    ...                           ...    ...        ...          ...   
553  2017  20 January 2017 (2017-01-20)  20:00   Morocco          Togo   
585  2019     27 June 2019 (2019-06-27)  22:00     Kenya      Tanzania   
595  2019     24 June 2019 (2019-06-24)  22:00      Mali    Mauritania   
608  2019      6 July 2019 (2019-07-06)  18:00   Nigeria      Cameroon   
611  2019      7 July 2019 (2019-07-07)  21:00   Algeria        Guinea   

     HomeTeamGoals  AwayTeamGoals              Stage    SpecialWinConditions  \
2              4.0            0
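
For an inclusive range check like this one, `Series.between()` expresses the same condition in a single call. A minimal sketch on sample data:

```python
import numpy as np
import pandas as pd

goals = pd.DataFrame({'HomeTeamGoals': [1.0, 3.0, 6.0, 7.0, np.nan]})

# between() includes both endpoints by default; NaN never matches
in_range = goals[goals['HomeTeamGoals'].between(3, 6)]
print(in_range)
```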

### 8. Modifying a Specific Value in a DataFrame

To change a specific value in a DataFrame, you can use the `.loc[]` accessor to address a row and column, and then assign a new value to it. `.loc[]` selects by label, so with the default integer index the label `2` refers to the third row.

In [15]:
import pandas as pd

# Read the CSV file into a DataFrame
df3 = pd.read_csv('AfricaCupofNationsMatches.csv')

# Change the "AwayTeamGoals" value in the 3rd row to 10
df3.loc[2, 'AwayTeamGoals'] = 10

# Display the modified DataFrame
print(df3)


     Year                      Date   Time      HomeTeam       AwayTeam  \
0    1957                  10-Feb-57    NaN       Sudan           Egypt   
1    1957                  10-Feb-57    NaN    Ethiopia    South Africa   
2    1957                  16-Feb-57    NaN       Egypt        Ethiopia   
3    1959                  22-May-59    NaN        Egypt       Ethiopia   
4    1959                  25-May-59    NaN       Sudan        Ethiopia   
..    ...                        ...    ...          ...            ...   
617  2019  11 July 2019 (2019-07-11)  21:00  Madagascar         Tunisia   
618  2019  14 July 2019 (2019-07-14)  18:00     Senegal         Tunisia   
619  2019  14 July 2019 (2019-07-14)  21:00     Algeria         Nigeria   
620  2019  17 July 2019 (2019-07-17)  21:00     Tunisia         Nigeria   
621  2019  19 July 2019 (2019-07-19)  21:00     Senegal         Algeria   

     HomeTeamGoals  AwayTeamGoals                 Stage  \
0              1.0            2.0       
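
The difference between label-based and position-based access shows up once the index is not the default 0, 1, 2, and so on. The sketch below uses a hypothetical index of 10, 20, 30 to contrast `.loc[]` with `.iloc[]`.

```python
import pandas as pd

matches = pd.DataFrame({'AwayTeamGoals': [2.0, 0.0, 1.0]}, index=[10, 20, 30])

# .loc uses labels: this changes the row whose index label is 20
matches.loc[20, 'AwayTeamGoals'] = 10

# .iloc uses positions: this changes the third row, whatever its label
col_pos = matches.columns.get_loc('AwayTeamGoals')
matches.iloc[2, col_pos] = 5

print(matches)
```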

### 9. Sorting a DataFrame by Multiple Columns

To sort a DataFrame by one or more columns, you can use the `sort_values()` method of the DataFrame.

In [16]:
import pandas as pd

# Read the CSV file into a DataFrame
df3 = pd.read_csv('AfricaCupofNationsMatches.csv')

# Sort the DataFrame by 'HomeTeam' in ascending order, then by 'HomeTeamGoals' in descending order
sorted_df = df3.sort_values(['HomeTeam', 'HomeTeamGoals'], ascending=[True, False])

# Display the sorted DataFrame
print(sorted_df)


     Year                         Date           Time    HomeTeam  \
205  1990                      2-Mar-90            NaN   Algeria    
29   1968                     14-Jan-68            NaN   Algeria    
135  1980                     16-Mar-80            NaN   Algeria    
164  1984                      5-Mar-84            NaN   Algeria    
171  1984                     17-Mar-84            NaN   Algeria    
..    ...                           ...            ...        ...   
521  2012                     12-Feb-12  20:30[note 1]    Zambia    
549  2017  23 January 2017 (2017-01-23)          20:00  Zimbabwe    
398  2004                     25-Jan-04          16:30  Zimbabwe    
437  2006                     23-Jan-06          20:00  Zimbabwe    
575  2019     30 June 2019 (2019-06-30)          21:00  Zimbabwe    

         AwayTeam  HomeTeamGoals  AwayTeamGoals              Stage  \
205       Nigeria            5.0            1.0            Group A   
29         Uganda            4.

### 10. Getting Column Headers of a DataFrame

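To get the column headers (column names) of a DataFrame, you can use its `columns` attribute; calling `.tolist()` on it returns a plain Python list. Note in the output below that 'Date ' and 'Time ' carry a trailing space, exactly as they appear in the CSV header.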

In [17]:
import pandas as pd

# Read the CSV file into a DataFrame
df3 = pd.read_csv('AfricaCupofNationsMatches.csv')

# Get a list of column headers
column_headers = df3.columns.tolist()

# Display the list of column headers
print(column_headers)


['Year', 'Date ', 'Time ', 'HomeTeam', 'AwayTeam', 'HomeTeamGoals', 'AwayTeamGoals', 'Stage', 'SpecialWinConditions', 'Stadium', 'City', 'Attendance']
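
Trailing spaces in column names are easy to miss and make lookups such as `df3['Date']` fail. One possible cleanup, sketched here on an empty stand-in DataFrame with the same style of headers, is to strip whitespace from every name with `rename()`.

```python
import pandas as pd

raw = pd.DataFrame(columns=['Year', 'Date ', 'Time ', 'HomeTeam'])

# rename() applies str.strip to each column label and returns a new DataFrame
cleaned = raw.rename(columns=str.strip)
print(cleaned.columns.tolist())
```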


### 11. Adding a New Column to a DataFrame

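To add a new column to a DataFrame, you can create a list or Series representing the values of the new column and assign it to a new column name in the DataFrame. A list must have exactly as many elements as the DataFrame has rows, otherwise pandas raises a `ValueError`. The example below checks the length first; its five-element list does not match the 622 rows, so the `else` branch runs. Note also that 'Stadium' already exists in this dataset, so a successful assignment would overwrite that column rather than add a new one.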

In [19]:
import pandas as pd

# Read the CSV file into a DataFrame
df3 = pd.read_csv('AfricaCupofNationsMatches.csv')

# Create a new column
new_column = ['Stadium A', 'Stadium B', 'Stadium C', 'Stadium D', 'Stadium E']

# Check if the length of new_column matches the number of rows in the DataFrame
if len(new_column) == len(df3):
    # Append the new column to the DataFrame
    df3['Stadium'] = new_column
    print(df3)
else:
    print("The length of new_column does not match the number of rows in the DataFrame.")


The length of new_column does not match the number of rows in the DataFrame.
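
A new column is often derived from existing ones, which guarantees a matching length. The sketch below works on a small in-memory DataFrame; the column names `TotalGoals` and `GoalDifference` are hypothetical and not part of the dataset.

```python
import pandas as pd

matches = pd.DataFrame({
    'HomeTeamGoals': [1.0, 4.0],
    'AwayTeamGoals': [2.0, 0.0],
})

# Direct assignment adds the column in place
matches['TotalGoals'] = matches['HomeTeamGoals'] + matches['AwayTeamGoals']

# assign() returns a new DataFrame and leaves the original untouched
with_margin = matches.assign(
    GoalDifference=matches['HomeTeamGoals'] - matches['AwayTeamGoals']
)
print(with_margin)
```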


### 12. Adding Rows to a DataFrame

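To add new rows to a DataFrame, you can create dictionaries representing the rows, build a DataFrame from them, and concatenate it with the original using the `pd.concat()` function. The dictionary keys must match the existing column names exactly. In the example below the new rows use the key 'Date' and omit 'Time', while the CSV header names are 'Date ' and 'Time ' (with a trailing space), which is the likely reason the Date and Time cells of rows 622 and 623 show `NaN` in the output.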

In [23]:
import pandas as pd

# Read the CSV file into a DataFrame
df3 = pd.read_csv('AfricaCupofNationsMatches.csv')

# Create two new rows as dictionaries
new_row1 = {'Year': 2023, 'Date': '10-Feb-23', 'HomeTeam': 'New Team 1', 'AwayTeam': 'New Team 2', 'HomeTeamGoals': 2, 'AwayTeamGoals': 3, 'Stage': 'Group Stage', 'SpecialWinConditions': 'N/A', 'Stadium': 'New Stadium', 'City': 'New City', 'Attendance': 10000}
new_row2 = {'Year': 2023, 'Date': '12-Feb-23', 'HomeTeam': 'New Team 3', 'AwayTeam': 'New Team 4', 'HomeTeamGoals': 1, 'AwayTeamGoals': 0, 'Stage': 'Group Stage', 'SpecialWinConditions': 'N/A', 'Stadium': 'New Stadium', 'City': 'New City', 'Attendance': 8000}

# Create a new DataFrame with the new rows
new_rows = pd.DataFrame([new_row1, new_row2])

# Concatenate the new DataFrame with the original DataFrame
df3 = pd.concat([df3, new_rows], ignore_index=True)

# Display the modified DataFrame
print(df3)


     Year                      Date   Time     HomeTeam       AwayTeam  \
0    1957                  10-Feb-57    NaN      Sudan           Egypt   
1    1957                  10-Feb-57    NaN   Ethiopia    South Africa   
2    1957                  16-Feb-57    NaN      Egypt        Ethiopia   
3    1959                  22-May-59    NaN       Egypt       Ethiopia   
4    1959                  25-May-59    NaN      Sudan        Ethiopia   
..    ...                        ...    ...         ...            ...   
619  2019  14 July 2019 (2019-07-14)  21:00    Algeria         Nigeria   
620  2019  17 July 2019 (2019-07-17)  21:00    Tunisia         Nigeria   
621  2019  19 July 2019 (2019-07-19)  21:00    Senegal         Algeria   
622  2023                        NaN    NaN  New Team 1     New Team 2   
623  2023                        NaN    NaN  New Team 3     New Team 4   

     HomeTeamGoals  AwayTeamGoals                 Stage  \
0              1.0            2.0            Semifin
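
The effect of a mismatched key can be reproduced on a tiny example. This sketch assumes a stand-in DataFrame whose header contains 'Date ' with a trailing space; it shows that `pd.concat()` creates a separate 'Date' column, and that renaming the key first avoids this.

```python
import pandas as pd

existing = pd.DataFrame({'Year': [2019], 'Date ': ['19 July 2019']})
new_rows = pd.DataFrame([{'Year': 2023, 'Date': '10-Feb-23'}])

# 'Date' and 'Date ' are different labels, so concat keeps both columns
combined = pd.concat([existing, new_rows], ignore_index=True)
print(combined.columns.tolist())

# Renaming the key to match the existing header keeps a single column
matched = pd.concat(
    [existing, new_rows.rename(columns={'Date': 'Date '})],
    ignore_index=True,
)
print(matched)
```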

### 13. Changing a Value in a DataFrame Column

To change a specific value in a DataFrame column, you can use the `replace()` method, which substitutes every value that matches the given one exactly.

In [24]:
import pandas as pd

# Read the CSV file into a DataFrame
df3 = pd.read_csv('AfricaCupofNationsMatches.csv')

# Change 'Uganda' to 'China' in the 'AwayTeam' column
df3['AwayTeam'] = df3['AwayTeam'].replace('Uganda', 'China')

# Display the modified DataFrame
print(df3)


     Year                      Date   Time      HomeTeam       AwayTeam  \
0    1957                  10-Feb-57    NaN       Sudan           Egypt   
1    1957                  10-Feb-57    NaN    Ethiopia    South Africa   
2    1957                  16-Feb-57    NaN       Egypt        Ethiopia   
3    1959                  22-May-59    NaN        Egypt       Ethiopia   
4    1959                  25-May-59    NaN       Sudan        Ethiopia   
..    ...                        ...    ...          ...            ...   
617  2019  11 July 2019 (2019-07-11)  21:00  Madagascar         Tunisia   
618  2019  14 July 2019 (2019-07-14)  18:00     Senegal         Tunisia   
619  2019  14 July 2019 (2019-07-14)  21:00     Algeria         Nigeria   
620  2019  17 July 2019 (2019-07-17)  21:00     Tunisia         Nigeria   
621  2019  19 July 2019 (2019-07-19)  21:00     Senegal         Algeria   

     HomeTeamGoals  AwayTeamGoals                 Stage  \
0              1.0            2.0       
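
Because `replace()` matches whole values, stray whitespace prevents a match. The iteration output later in this document prints some team names with leading or trailing spaces, so the sketch below (on made-up sample values) strips whitespace before replacing. Whether the real 'AwayTeam' values need this is an assumption to verify against the file.

```python
import pandas as pd

away = pd.Series([' Uganda', 'Egypt', ' Tunisia'])

# No exact match: ' Uganda' (with a leading space) is not 'Uganda'
unchanged = away.replace('Uganda', 'China')

# Stripping whitespace first lets the replacement take effect
fixed = away.str.strip().replace('Uganda', 'China')
print(fixed.tolist())
```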

### 14. Resetting Index in a DataFrame

To reset the index of a DataFrame, you can use the `reset_index()` method. Passing `drop=True` discards the old index instead of inserting it as a new column.

In [25]:
import pandas as pd

# Read the CSV file into a DataFrame
df3 = pd.read_csv('AfricaCupofNationsMatches.csv')

# Reset the index of the DataFrame
df3 = df3.reset_index(drop=True)

# Display the modified DataFrame
print(df3)


     Year                      Date   Time      HomeTeam       AwayTeam  \
0    1957                  10-Feb-57    NaN       Sudan           Egypt   
1    1957                  10-Feb-57    NaN    Ethiopia    South Africa   
2    1957                  16-Feb-57    NaN       Egypt        Ethiopia   
3    1959                  22-May-59    NaN        Egypt       Ethiopia   
4    1959                  25-May-59    NaN       Sudan        Ethiopia   
..    ...                        ...    ...          ...            ...   
617  2019  11 July 2019 (2019-07-11)  21:00  Madagascar         Tunisia   
618  2019  14 July 2019 (2019-07-14)  18:00     Senegal         Tunisia   
619  2019  14 July 2019 (2019-07-14)  21:00     Algeria         Nigeria   
620  2019  17 July 2019 (2019-07-17)  21:00     Tunisia         Nigeria   
621  2019  19 July 2019 (2019-07-19)  21:00     Senegal         Algeria   

     HomeTeamGoals  AwayTeamGoals                 Stage  \
0              1.0            2.0       

### 15. Checking if 'Stadium' Column is Present in the DataFrame

To check if a specific column exists in a DataFrame, we can use the `in` operator along with the `.columns` attribute of the DataFrame. This allows us to verify whether the 'Stadium' column is present or not.


In [1]:
import pandas as pd

# Read the CSV file into a DataFrame
df3 = pd.read_csv('AfricaCupofNationsMatches.csv')

# Check if 'Stadium' column is present
if 'Stadium' in df3.columns:
    print("'Stadium' column is present in the DataFrame")
else:
    print("'Stadium' column is not present in the DataFrame")


'Stadium' column is present in the DataFrame


### 16. Converting the Datatype of the 'AwayTeamGoals' Column

When working with data, it is important to ensure that the datatype of each column is appropriate for the data it contains. To convert a column in pandas, use the `astype()` method and pass the desired datatype.

The example below converts the 'AwayTeamGoals' column to floating-point (float). A float column can hold `NaN` for missing values, which a plain integer column cannot. In this dataset the column already prints values such as `2.0` and `NaN`, so it is most likely read as float to begin with, and the conversion mainly makes the intended type explicit.



In [4]:
# Convert datatype of 'AwayTeamGoals' column to float
df3['AwayTeamGoals'] = df3['AwayTeamGoals'].astype(float)
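
If whole-number goals with missing values are preferred, pandas also offers the nullable integer dtype `'Int64'` (capital I). The following sketch uses a short sample Series rather than the real column.

```python
import numpy as np
import pandas as pd

goals = pd.Series([2.0, np.nan, 0.0])

as_float = goals.astype(float)            # float64, missing value stays NaN
as_nullable_int = goals.astype('Int64')   # integers, missing value becomes <NA>
print(as_nullable_int)
```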


### 17. Removing the Last 10 Rows from a DataFrame

To remove the last 10 rows from a DataFrame in Pandas, you can use the `drop` method with the appropriate index values. Here's an example:


In [5]:
# Remove the last 10 rows from the DataFrame
df3 = df3.drop(df3.tail(10).index)

# Display the modified DataFrame
df3


Unnamed: 0,Year,Date,Time,HomeTeam,AwayTeam,HomeTeamGoals,AwayTeamGoals,Stage,SpecialWinConditions,Stadium,City,Attendance
0,1957,10-Feb-57,,Sudan,Egypt,1.0,2.0,Semifinals,,Municipal Stadium,Khartoum,30000.0
1,1957,10-Feb-57,,Ethiopia,South Africa,,,Semifinals,Ethiopia wins due to disqualification of othe...,,,
2,1957,16-Feb-57,,Egypt,Ethiopia,4.0,0.0,Final,,Municipal Stadium,Khartoum,30000.0
3,1959,22-May-59,,Egypt,Ethiopia,4.0,0.0,Final Tournament,,Prince Farouk Stadium,Cairo,30000.0
4,1959,25-May-59,,Sudan,Ethiopia,1.0,0.0,Final Tournament,,Prince Farouk Stadium,Cairo,20000.0
...,...,...,...,...,...,...,...,...,...,...,...,...
607,2019,5 July 2019 (2019-07-05),21:00,Uganda,Senegal,0.0,1.0,Round of 16,,Cairo International Stadium,Cairo,6950.0
608,2019,6 July 2019 (2019-07-06),18:00,Nigeria,Cameroon,3.0,2.0,Round of 16,,Alexandria Stadium,Alexandria,10000.0
609,2019,6 July 2019 (2019-07-06),21:00,Egypt,South Africa,0.0,1.0,Round of 16,,Cairo International Stadium,Cairo,75000.0
610,2019,7 July 2019 (2019-07-07),18:00,Madagascar,DR Congo,2.0,2.0,Round of 16,Madagascar win on Penalities 4-2,Alexandria Stadium,Alexandria,5890.0


### 18. Iterating Over Rows in a DataFrame Efficiently

When you need to iterate over the rows of a large DataFrame, `itertuples()` is usually more efficient than `iterrows()`, because it does not build a Series for every row.

`itertuples()` returns an iterator that yields one namedtuple per row. The attribute names correspond to the column names, provided those names are valid Python identifiers; other columns (for example, a name containing a space) are given positional names instead.

To iterate over rows using `itertuples()`, you can follow these steps:

1. Call the `itertuples()` method on the DataFrame to obtain the iterator.
2. Loop over the iterator with a `for` statement.
3. Inside the loop, access the row data using attribute names.

Here's an example of how to use `itertuples()` to iterate over rows in a DataFrame:

```python
# Iterate over rows in the DataFrame (Column1 and Column2 are placeholder names)
for row in df.itertuples(index=False):
    # Access row data using attribute names
    print(f"Column 1: {row.Column1}")
    print(f"Column 2: {row.Column2}")
    # Add more print statements for other columns if needed
    print("------")
```


In [6]:
# Iterate over rows in the DataFrame
for row in df3.itertuples(index=False):
    # Access row data using attribute names
    print(f"Home Team: {row.HomeTeam}")
    print(f"Away Team: {row.AwayTeam}")
    print(f"Home Team Goals: {row.HomeTeamGoals}")
    print(f"Away Team Goals: {row.AwayTeamGoals}")
    print("------")


Home Team: Sudan 
Away Team:  Egypt
Home Team Goals: 1.0
Away Team Goals: 2.0
------
Home Team: Ethiopia 
Away Team:  South Africa
Home Team Goals: nan
Away Team Goals: nan
------
Home Team: Egypt 
Away Team:  Ethiopia
Home Team Goals: 4.0
Away Team Goals: 0.0
------
Home Team: Egypt
Away Team:  Ethiopia
Home Team Goals: 4.0
Away Team Goals: 0.0
------
Home Team: Sudan 
Away Team:  Ethiopia
Home Team Goals: 1.0
Away Team Goals: 0.0
------
Home Team: Egypt
Away Team:  Sudan
Home Team Goals: 2.0
Away Team Goals: 1.0
------
Home Team: Ethiopia 
Away Team:  Tunisia
Home Team Goals: 4.0
Away Team Goals: 2.0
------
Home Team: Egypt
Away Team:  Uganda
Home Team Goals: 2.0
Away Team Goals: 1.0
------
Home Team: Tunisia 
Away Team:  Uganda
Home Team Goals: 3.0
Away Team Goals: 0.0
------
Home Team: Ethiopia 
Away Team: Egypt
Home Team Goals: 4.0
Away Team Goals: 2.0
------
Home Team: Ghana 
Away Team:  Tunisia
Home Team Goals: 1.0
Away Team Goals: 1.0
------
Home Team: Ghana 
Away Team:  Ethiop
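
For computations, as opposed to printing, a column-wise (vectorized) expression usually replaces the loop entirely. The sketch below compares both approaches on a small in-memory DataFrame; the variable names are illustrative.

```python
import pandas as pd

matches = pd.DataFrame({
    'HomeTeam': ['Sudan', 'Egypt'],
    'AwayTeam': ['Egypt', 'Ethiopia'],
    'HomeTeamGoals': [1.0, 4.0],
    'AwayTeamGoals': [2.0, 0.0],
})

# Row by row with itertuples()
loop_totals = [
    row.HomeTeamGoals + row.AwayTeamGoals
    for row in matches.itertuples(index=False)
]

# The same result computed on whole columns at once
vector_totals = (matches['HomeTeamGoals'] + matches['AwayTeamGoals']).tolist()
print(loop_totals, vector_totals)
```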

### 18. Iterating Over Rows in a DataFrame Using the `iterrows()` Method (Second Method)

To iterate over the rows in a DataFrame and access the row data, you can also use the `iterrows()` method. It yields an `(index, row)` pair for each row, where `row` is a Series indexed by the column names, so you can perform operations on each row individually.

Here's an example of how to iterate over rows in a DataFrame:

```python
for index, row in df.iterrows():
    # Access row data using column names, e.g. row['ColumnName']
    # Perform desired operations here
    pass
```


In [7]:
# Iterate over rows in the DataFrame
for index, row in df3.iterrows():
    # Access row data using column names
    print(f"Row index: {index}")
    print(f"Home Team: {row['HomeTeam']}")
    print(f"Away Team: {row['AwayTeam']}")
    print(f"Home Team Goals: {row['HomeTeamGoals']}")
    print(f"Away Team Goals: {row['AwayTeamGoals']}")
    print("------")


Row index: 0
Home Team: Sudan 
Away Team:  Egypt
Home Team Goals: 1.0
Away Team Goals: 2.0
------
Row index: 1
Home Team: Ethiopia 
Away Team:  South Africa
Home Team Goals: nan
Away Team Goals: nan
------
Row index: 2
Home Team: Egypt 
Away Team:  Ethiopia
Home Team Goals: 4.0
Away Team Goals: 0.0
------
Row index: 3
Home Team: Egypt
Away Team:  Ethiopia
Home Team Goals: 4.0
Away Team Goals: 0.0
------
Row index: 4
Home Team: Sudan 
Away Team:  Ethiopia
Home Team Goals: 1.0
Away Team Goals: 0.0
------
Row index: 5
Home Team: Egypt
Away Team:  Sudan
Home Team Goals: 2.0
Away Team Goals: 1.0
------
Row index: 6
Home Team: Ethiopia 
Away Team:  Tunisia
Home Team Goals: 4.0
Away Team Goals: 2.0
------
Row index: 7
Home Team: Egypt
Away Team:  Uganda
Home Team Goals: 2.0
Away Team Goals: 1.0
------
Row index: 8
Home Team: Tunisia 
Away Team:  Uganda
Home Team Goals: 3.0
Away Team Goals: 0.0
------
Row index: 9
Home Team: Ethiopia 
Away Team: Egypt
Home Team Goals: 4.0
Away Team Goals: 2.0


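### 19. Reordering Columns in a DataFrame

To change the order of columns in the DataFrame `df3`, you can follow these steps:

1. Define the desired order of columns using a list. For example:
   ```python
   new_column_order = ['Stadium', 'City', 'Attendance', 'HomeTeam', 'AwayTeam', 'HomeTeamGoals', 'AwayTeamGoals', 'Stage', 'SpecialWinConditions']
   ```
2. Index the DataFrame with that list (`df3[new_column_order]`). Columns left out of the list, here 'Year', 'Date ' and 'Time ', are dropped from the result.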


In [10]:
# Define the desired order of columns
new_column_order = ['Stadium', 'City', 'Attendance', 'HomeTeam', 'AwayTeam', 'HomeTeamGoals', 'AwayTeamGoals', 'Stage', 'SpecialWinConditions']

# Reorder the columns in df3
df3 = df3[new_column_order]
# Print the new order of columns
print(df3.columns)

Index(['Stadium', 'City', 'Attendance', 'HomeTeam', 'AwayTeam',
       'HomeTeamGoals', 'AwayTeamGoals', 'Stage', 'SpecialWinConditions'],
      dtype='object')
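
When only a few columns need to move and the rest should be kept, the full list can be built from the existing columns. A minimal sketch on a stand-in DataFrame; `front` and `rest` are illustrative names.

```python
import pandas as pd

matches = pd.DataFrame({
    'HomeTeam': ['Sudan'],
    'AwayTeam': ['Egypt'],
    'Stadium': ['Municipal Stadium'],
})

# Put the chosen columns first and keep every other column in its original order
front = ['Stadium']
rest = [col for col in matches.columns if col not in front]
reordered = matches[front + rest]
print(reordered.columns.tolist())
```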


### 20. Deleting Rows with 0 in 'HomeTeamGoals' Column

To delete rows from a DataFrame where the value is 0 in the 'HomeTeamGoals' column, you can use the following steps:

1. Identify the DataFrame: In this case, the DataFrame is named `df3`.

2. Delete rows: Use the pandas DataFrame indexing to filter out the rows where the value in the 'HomeTeamGoals' column is 0. Here's an example code snippet:

    ```python
    df3 = df3[df3['HomeTeamGoals'] != 0]
    ```

    This code rebinds `df3` to a DataFrame that contains only the rows where the 'HomeTeamGoals' value is not equal to 0. Rows where the value is missing (`NaN`) are kept, because `NaN != 0` evaluates to `True`.

3. Print the updated DataFrame: After deleting the rows, you can print the updated DataFrame to verify the changes:

    ```python
    print(df3)
    ```

    This will display the updated DataFrame in the output.

In [11]:
# Delete rows with 0 in 'HomeTeamGoals' column
df3 = df3[df3['HomeTeamGoals'] != 0]

# Print the updated DataFrame
print(df3)


                   Stadium         City  Attendance     HomeTeam  \
0        Municipal Stadium     Khartoum     30000.0       Sudan    
1                      NaN          NaN         NaN    Ethiopia    
2        Municipal Stadium     Khartoum     30000.0       Egypt    
3    Prince Farouk Stadium        Cairo     30000.0        Egypt   
4    Prince Farouk Stadium        Cairo     20000.0       Sudan    
..                     ...          ...         ...          ...   
601       Ismailia Stadium     Ismailia      8094.0       Ghana    
606       Al Salam Stadium        Cairo      7500.0     Morocco    
608     Alexandria Stadium   Alexandria     10000.0     Nigeria    
610     Alexandria Stadium   Alexandria      5890.0  Madagascar    
611        30 June Stadium        Cairo      8205.0     Algeria    

          AwayTeam  HomeTeamGoals  AwayTeamGoals             Stage  \
0            Egypt            1.0            2.0        Semifinals   
1     South Africa            NaN          
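
Row 1 in the output above still appears even though its 'HomeTeamGoals' value is `NaN`. If rows with missing goals should go as well, a `> 0` comparison is one option, since comparisons with `NaN` are `False`. A sketch on sample data:

```python
import numpy as np
import pandas as pd

goals = pd.DataFrame({'HomeTeamGoals': [1.0, np.nan, 0.0, 4.0]})

# != 0 removes only the zeros; the NaN row survives
nonzero = goals[goals['HomeTeamGoals'] != 0]

# > 0 removes zeros and NaN rows alike
scored = goals[goals['HomeTeamGoals'] > 0]
print(len(nonzero), scored['HomeTeamGoals'].tolist())
```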