# Exploratory Data Analysis - ASUU Strike Effect Analysis Project

## Data Cleaning

While we attempted to reduce the amount of cleaning required, via good survey design, there are still some cases that would require cleaning. 


### Data Cleaning Cadence
1. Inspect data for structure and data types
2. Inspect for missing values
3. Check for duplicates in data
4. Check for outliers
5. Mislabeling and spurious data entries
6. Inspect data columns for data types

Import packages

In [110]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

Load data

In [111]:
data_filepath = "../data/strike_and_academic_performance.csv"

data = pd.read_csv(data_filepath)

In [112]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 431 entries, 0 to 430
Data columns (total 21 columns):
 #   Column                                                                              Non-Null Count  Dtype  
---  ------                                                                              --------------  -----  
 0   Are you a student of the University of Lagos?                                       431 non-null    object 
 1   If not, what is your university/institution?                                        22 non-null     object 
 2   What is your current academic level?                                                431 non-null    object 
 3   How old are you?                                                                    431 non-null    int64  
 4   What is your gender?                                                                431 non-null    object 
 5   What was your relationship status during the strike?                                431 non-null   

### Inspect data for structure and data type

In [113]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 431 entries, 0 to 430
Data columns (total 21 columns):
 #   Column                                                                              Non-Null Count  Dtype  
---  ------                                                                              --------------  -----  
 0   Are you a student of the University of Lagos?                                       431 non-null    object 
 1   If not, what is your university/institution?                                        22 non-null     object 
 2   What is your current academic level?                                                431 non-null    object 
 3   How old are you?                                                                    431 non-null    int64  
 4   What is your gender?                                                                431 non-null    object 
 5   What was your relationship status during the strike?                                431 non-null   

Are there any missing values?

In [114]:
data.isna().sum()

Are you a student of the University of Lagos?                                           0
If not, what is your university/institution?                                          409
What is your current academic level?                                                    0
How old are you?                                                                        0
What is your gender?                                                                    0
What was your relationship status during the strike?                                    0
What is your faculty?                                                                   0
What is your department?                                                                0
Kindly input your department, if not listed in the previous question.                 403
How has the ASUU strike affected you and your academic performance?                    86
What was the most challenging part of returning to academic life after the strike?     91
Did you un

### Rename Columns
The current column names are too long. Let's make them shorter for easier analysis.

In [115]:
# Rename multiple columns

column_mapping = {
    'Are you a student of the University of Lagos?': 'unilag',
    'If not, what is your university/institution?': 'non_unilag',
    'What is your current academic level?': 'level',
    'How old are you?': 'age',
    'What is your gender?': 'gender',
    'What was your relationship status during the strike?': 'relationship',
    'What is your faculty?': 'faculty',
    'What is your department?': 'department',
    'Kindly input your department, if not listed in the previous question. ': 'other_dept',
    'How has the ASUU strike affected you and your academic performance?': 'strike_effect',
    'What was the most challenging part of returning to academic life after the strike?': 'challenge',
    'Did you undertake any work during the strike?': 'work',
    'How did you develop yourself during the strike?': 'skills',
    'How prepared were you for the exams? [Before Strike]': 'prep_before',
    'How prepared were you for the exams? [After Strike]': 'prep_after',
    'How were your lectures affected by the strike?': 'lecture',
    'How often did you engage in academic activities during the strike?': 'academic_act',
    'How many courses did you take in the affected semester? ': 'courses_taken',
    'How many credit units did your courses add up to in the affected semester?': 'course_unit',
    'What was your CGPA before the strike?': 'cgpa_before',
    'What is your current CGPA?': 'cgpa_after'
}

data = data.rename(columns=column_mapping)

data.head(2)

Unnamed: 0,unilag,non_unilag,level,age,gender,relationship,faculty,department,other_dept,strike_effect,...,work,skills,prep_before,prep_after,lecture,academic_act,courses_taken,course_unit,cgpa_before,cgpa_after
0,Yes,,400 Level,22,Male,Single,Engineering,Chemical Engineering,,I learned how to study better and my grades al...,...,Worked in a role relevant to my studies,Acquired skills unrelated to course of study,Poorly,Poorly,No noticeable change,Rarely: I engaged in academic activities once ...,10,23.0,3.39,3.51
1,Yes,,400 Level,23,Female,Single,Engineering,Chemical Engineering,,It affected it in a negative way as it became ...,...,Did not work during the strike,Acquired skills unrelated to course of study,Poorly,Moderately,No noticeable change,Rarely: I engaged in academic activities once ...,10,23.0,4.44,4.5


### Creating our target column
Our target column in this analysis is the change in CGPA after the strike. We're trying to see if the strike had a positive or negative effect on the participants.

If CGPA increases after the strike, it may be possible that the strike had a positive effect on the participants. And vice versa. 

In [116]:
#Creating our outcome variable column

data['cgpa_change'] = data['cgpa_after'] - data['cgpa_before']

### Solving the department debacle

There are 2 columns for department `department` and `other_dept`, 
1. One contains the main department of individuals who had their department on the list while filling, and those that did not find theirs and had to specify
2. We have to find a way to merge them, as clearly one of the columns aren't needed

In [117]:
data.department.unique()

array(['Chemical Engineering', 'Political Science',
       'Computer Engineering', 'Educational Foundations', 'Statistics',
       'Geosciences', 'Science Tech. Education',
       'Petroleum & Gas Engineering', 'Cell Biology & Genetics',
       'Surveying & Geo-Informatics Engineering', 'Mathematics',
       'Finance', 'Marine Science',
       'Industrial Relations & Personnel Management',
       'Mechanical Engineering', 'Mass Communication',
       'Biomedical Engineering', 'Estate', 'Other',
       'Biochemistry (Basic Medical Sciences)', 'Law', 'Medicine',
       'Arts & Social Science Education', 'Zoology',
       'Biochemistry (Sciences)', 'Education Administration', 'Botany',
       'Economics', 'Systems Engineering', 'Psychology', 'Accounting',
       'Physics', 'Radiology', 'Electrical & Electronics Engineering',
       'Geography', 'Microbiology', 'Chemistry', 'Architecture',
       'Biology Education', 'Human Kinetics & Health Education',
       'Physiology', 'Adult Educatio

In [118]:
data.other_dept.unique()

array([nan, 'Geophysics ', 'Radiography ', 'Biology Education ',
       'FISHERIES ', 'Education and Biology ', 'Pharmacy ', 'Law ',
       'Early childhood education ', 'Education Eng',
       'Business Education ', 'Education ', 'Business Education',
       'Religious Studies ', 'Technology and vocational education ',
       'Communication and Language Arts ', 'Pharmacology ',
       'Pharmacology, therapeutics and toxicology ', 'PHARMACY ',
       'Pharmacy', 'Pharmacology', 'Mechatronics Engineering.',
       'Banking and Finance ', 'Insurance ', 'Education foundation ',
       'Art & Social Science Education '], dtype=object)

Replace "Other" in `department` with the corresponding value from `other_dept`

In [119]:
data['department'] = data['department'].str.lower()
data['other_dept'] = data['other_dept'].str.lower()


data['department'] = data.apply(lambda row: row['other_dept'] if row['department'] == 'other' else row['department'], axis=1).str.capitalize()


In [120]:
#Deal with any missing data in this column
data[data['department'].isna()]

Unnamed: 0,unilag,non_unilag,level,age,gender,relationship,faculty,department,other_dept,strike_effect,...,skills,prep_before,prep_after,lecture,academic_act,courses_taken,course_unit,cgpa_before,cgpa_after,cgpa_change
51,Yes,,200 Level,20,Male,Single,Pharmacy,,,,...,Acquired skills unrelated to course of study,Moderately,Moderately,Fewer lecturers attended classes,Often: I engaged in academic activities regula...,7,,5.0,4.89,-0.11
331,Yes,,300 Level,22,Female,Single,Social Sciences,,,I am mentally tired,...,"Volunteered for an event or organization, Acqu...",Moderately,Poorly,No noticeable change,Rarely: I engaged in academic activities once ...,7,18.0,0.0,0.0,0.0


There are two missing values here. We know that "Pharmacy is the most common value in `other_dept` for Pharmcy students. Hence, we can replace the NaN there with it. 

In [121]:
#For location 51
data.loc[51, 'department'] = 'Pharmacy'

However, we have no context here and would have to drop.

In [122]:
# Drop row with index 331 and reindex the DataFrame
data.drop(331, inplace=True)
data.reset_index(drop=True, inplace=True)

What are the most common departments?

In [123]:
data['department'].value_counts()

Cell biology & genetics       34
Chemical engineering          22
Accounting                    22
Educational foundations       19
Finance                       19
                              ..
Education and biology          1
Pharmacy                       1
Fisheries                      1
Biomedical engineering         1
Medical laboratory science     1
Name: department, Length: 76, dtype: int64

How many rows do we have in the data now?

In [124]:
len(data['department'])

430

Now let's drop the other_dept column.

In [125]:
data.drop(columns=['other_dept'], inplace=True)
data.head(2)

Unnamed: 0,unilag,non_unilag,level,age,gender,relationship,faculty,department,strike_effect,challenge,...,skills,prep_before,prep_after,lecture,academic_act,courses_taken,course_unit,cgpa_before,cgpa_after,cgpa_change
0,Yes,,400 Level,22,Male,Single,Engineering,Chemical engineering,I learned how to study better and my grades al...,Trying to remember things we were taught befor...,...,Acquired skills unrelated to course of study,Poorly,Poorly,No noticeable change,Rarely: I engaged in academic activities once ...,10,23.0,3.39,3.51,0.12
1,Yes,,400 Level,23,Female,Single,Engineering,Chemical engineering,It affected it in a negative way as it became ...,"Rekindling the student in me, lol. Trying to g...",...,Acquired skills unrelated to course of study,Poorly,Moderately,No noticeable change,Rarely: I engaged in academic activities once ...,10,23.0,4.44,4.5,0.06


### Non-UNILAG Students

We would like to extract a df of students that are not from the University of Lagos for future analysis

In [126]:
df_non_unilag = data[data['unilag'] == "No"] 

df_non_unilag.head()

Unnamed: 0,unilag,non_unilag,level,age,gender,relationship,faculty,department,strike_effect,challenge,...,skills,prep_before,prep_after,lecture,academic_act,courses_taken,course_unit,cgpa_before,cgpa_after,cgpa_change
32,No,Federal University of Petroleum Resources Effu...,300 Level,20,Female,Single,Engineering,Chemical engineering,No effect,Reading and going to class,...,None of the above,Very,Very,Fewer lecturers attended classes,Rarely: I engaged in academic activities once ...,11,22.0,4.67,4.67,0.0
50,No,OOU OGUN STATE,500 Level,23,Male,Single,AGRICULTURE,Fisheries,,,...,Acquired skills unrelated to course of study,Very,Moderately,Worse lectures after the strike,Rarely: I engaged in academic activities once ...,9,19.0,3.51,3.66,0.15
97,No,University of Ibadan,300 Level,22,Male,Single,Social Sciences,Economics,Reduced enthusiasm for academic related activi...,Refocusing,...,Acquired skills relevant to course of study,Moderately,Moderately,Worse lectures after the strike,Never: I did not engage in any academic activi...,10,30.0,2.8,2.7,-0.1
132,No,Alex Ekwueme Federal University Ndufu-Alike Ik...,300 Level,23,Female,Single,Basic Medical Sciences,Physiology,"It made me slack,I don't know what I'm doing i...",Studying school books,...,Volunteered for an event or organization,Poorly,Moderately,Fewer lecturers attended classes,Rarely: I engaged in academic activities once ...,9,21.0,2.95,0.0,-2.95
135,No,University of ibadan,Masters Program,26,Female,Single,Basic Medical Sciences,Microbiology,"Not really, it was just more difficult than ea...",Going back to lecture room was quite difficult...,...,Acquired skills unrelated to course of study,Very,Moderately,No noticeable change,Never: I did not engage in any academic activi...,9,3.0,0.0,4.01,4.01


Students without a CGPA won't help our analysis much.

In [127]:
no_cgpa = df_non_unilag[(df_non_unilag["cgpa_before"] == 0)|(df_non_unilag["cgpa_after"] == 0)]

df_non_unilag = df_non_unilag.drop(no_cgpa.index)

In [128]:
df_non_unilag.describe()

Unnamed: 0,age,courses_taken,course_unit,cgpa_before,cgpa_after,cgpa_change
count,14.0,14.0,13.0,14.0,14.0,14.0
mean,21.642857,8.928571,20.307692,3.627857,3.586429,-0.041429
std,1.446861,1.899971,7.57611,0.694098,0.69286,0.453022
min,20.0,6.0,0.0,2.45,2.0,-1.0
25%,21.0,7.25,18.0,3.1,3.2125,-0.275
50%,21.0,9.0,22.0,3.615,3.685,-0.09
75%,22.75,10.0,23.0,4.0,3.8975,0.1125
max,25.0,12.0,30.0,4.67,4.67,0.9


### UNILAG students

Now we may address the bulk of our problem. The main population for this study consists of UNILAG students. Let's start by subsetting UNIALG students. 

In [129]:
df_unilag = data[data['unilag'] != "No"] 

# drop non_unilag column
df_unilag = df_unilag.drop("non_unilag", axis=1)
 
df_unilag.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 413 entries, 0 to 429
Data columns (total 20 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   unilag         413 non-null    object 
 1   level          413 non-null    object 
 2   age            413 non-null    int64  
 3   gender         413 non-null    object 
 4   relationship   413 non-null    object 
 5   faculty        413 non-null    object 
 6   department     413 non-null    object 
 7   strike_effect  332 non-null    object 
 8   challenge      327 non-null    object 
 9   work           413 non-null    object 
 10  skills         413 non-null    object 
 11  prep_before    413 non-null    object 
 12  prep_after     413 non-null    object 
 13  lecture        413 non-null    object 
 14  academic_act   413 non-null    object 
 15  courses_taken  413 non-null    int64  
 16  course_unit    337 non-null    float64
 17  cgpa_before    413 non-null    float64
 18  cgpa_after

#### Addressing cases of No CGPA

Since the 2022 ASUU strike occured in the first semester, we expect that newly admitted students won't have a CGPA before the strike. In such cases, the participants were told to input "0".

In [130]:
#Extract a dataframe of individuals who had no cgpa before but cgpa after

no_cgpa_before = df_unilag[df_unilag['cgpa_before'] == 0]

no_cgpa_before.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 107 entries, 4 to 428
Data columns (total 20 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   unilag         107 non-null    object 
 1   level          107 non-null    object 
 2   age            107 non-null    int64  
 3   gender         107 non-null    object 
 4   relationship   107 non-null    object 
 5   faculty        107 non-null    object 
 6   department     107 non-null    object 
 7   strike_effect  82 non-null     object 
 8   challenge      80 non-null     object 
 9   work           107 non-null    object 
 10  skills         107 non-null    object 
 11  prep_before    107 non-null    object 
 12  prep_after     107 non-null    object 
 13  lecture        107 non-null    object 
 14  academic_act   107 non-null    object 
 15  courses_taken  107 non-null    int64  
 16  course_unit    92 non-null     float64
 17  cgpa_before    107 non-null    float64
 18  cgpa_after

#### No CGPA After? 
At the time of this survey, some students were yet to see their results. Hence, they filled in 0 in the CGPA After column. While their comments may be useful, we can't use the value they inputed in analysis. Therefore, we have to address this.

What proportion of students in the UNILAG dataframe are victims of this?

In [131]:
percent_affected = len(df_unilag[df_unilag['cgpa_after'] == 0])/len(df_unilag) * 100

f"About {round(percent_affected)}% of UNILAG Students were affected"

'About 9% of UNILAG Students were affected'

How many students have complete info?

In [132]:
len(df_unilag[(df_unilag['cgpa_before'] != 0.00) & (df_unilag['cgpa_after'] != 0)])

300

#### Valid CGPA

Now we have a dataframe of students with valid CGPAs. That is, neither CGPA before nor CGPA after = 0.

In [133]:
df_valid_cgpa = df_unilag[(df_unilag['cgpa_before'] != 0.00) & (df_unilag['cgpa_after'] != 0)]

In [134]:
#Checking data information
df_valid_cgpa.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 300 entries, 0 to 429
Data columns (total 20 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   unilag         300 non-null    object 
 1   level          300 non-null    object 
 2   age            300 non-null    int64  
 3   gender         300 non-null    object 
 4   relationship   300 non-null    object 
 5   faculty        300 non-null    object 
 6   department     300 non-null    object 
 7   strike_effect  244 non-null    object 
 8   challenge      241 non-null    object 
 9   work           300 non-null    object 
 10  skills         300 non-null    object 
 11  prep_before    300 non-null    object 
 12  prep_after     300 non-null    object 
 13  lecture        300 non-null    object 
 14  academic_act   300 non-null    object 
 15  courses_taken  300 non-null    int64  
 16  course_unit    239 non-null    float64
 17  cgpa_before    300 non-null    float64
 18  cgpa_after

Checking missing values for each column, once more

In [135]:
df_valid_cgpa.isna().sum()

unilag            0
level             0
age               0
gender            0
relationship      0
faculty           0
department        0
strike_effect    56
challenge        59
work              0
skills            0
prep_before       0
prep_after        0
lecture           0
academic_act      0
courses_taken     0
course_unit      61
cgpa_before       0
cgpa_after        0
cgpa_change       0
dtype: int64

Lots of missing values in the `strike_effect` and `challenge` columns. We'll deal with that later.

#### Quest for Duplicates

Are there any duplicates?

In [136]:
#Search for duplicates

df_valid_cgpa[df_valid_cgpa.duplicated()]

Unnamed: 0,unilag,level,age,gender,relationship,faculty,department,strike_effect,challenge,work,skills,prep_before,prep_after,lecture,academic_act,courses_taken,course_unit,cgpa_before,cgpa_after,cgpa_change
160,Yes,400 Level,22,Female,Single,Education,Arts & social science education,I just want to end all this..🥲,Having to return back to reading books and att...,Worked in a role unrelated to my studies,"Acquired skills unrelated to course of study, ...",Moderately,Poorly,No noticeable change,Never: I did not engage in any academic activi...,8,16.0,3.69,2.34,-1.35


In [137]:
df_valid_cgpa = df_valid_cgpa.drop_duplicates(keep='first')
df_valid_cgpa.reset_index(drop=True, inplace = True)

#### Missing values in the course_unit column?

It is clear that 75 missing values in a column is wild.

The safest option is to create a simple model to predict the values

In [138]:
col_list = df_unilag.columns.tolist()

print(col_list, end = '')

['unilag', 'level', 'age', 'gender', 'relationship', 'faculty', 'department', 'strike_effect', 'challenge', 'work', 'skills', 'prep_before', 'prep_after', 'lecture', 'academic_act', 'courses_taken', 'course_unit', 'cgpa_before', 'cgpa_after', 'cgpa_change']

Select the relevant courses to predict `course_units`. These are dependent on `level`, `faculty`, `department`, and `course_taken` .

In [139]:
#select relevant columns

missing_course_units  = df_valid_cgpa[["level", "faculty", "department", "courses_taken","course_unit"]]
missing_course_units.head(2)

Unnamed: 0,level,faculty,department,courses_taken,course_unit
0,400 Level,Engineering,Chemical engineering,10,23.0
1,400 Level,Engineering,Chemical engineering,10,23.0


In [140]:
len(missing_course_units)

299

In [148]:
test_data = (missing_course_units[missing_course_units['course_unit'].notnull() & (missing_course_units['course_unit'] > 0)]
             .reset_index(drop=True))

test_data.head()

Unnamed: 0,level,faculty,department,courses_taken,course_unit
0,400 Level,Engineering,Chemical engineering,10,23.0
1,400 Level,Engineering,Chemical engineering,10,23.0
2,400 Level,Engineering,Chemical engineering,10,23.0
3,300 Level,Education,Educational foundations,9,18.0
4,200 Level,Sciences,Statistics,5,15.0


In [141]:
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

# Step 1: Prepare the data

test_data = (missing_course_units[missing_course_units['course_unit'].notnull() & (missing_course_units['course_unit'] > 0)]
             .reset_index(drop=True))

# Create a DataFrame with missing values for 'course unit per semester' (training data)
training_data = missing_course_units[missing_course_units['course_unit'] == 0].reset_index(drop=True)


# Step 2: Split the data into features (X) and target (y)

X_test = test_data.drop(columns=['course_unit'])
y_test = test_data['course_unit']

In [143]:
training_data

Unnamed: 0,level,faculty,department,courses_taken,course_unit
0,200 Level,Management Science,Finance,7,0.0
1,300 Level,Education,Educational foundations,8,0.0
2,300 Level,Education,Educational foundations,9,0.0
3,200 Level,Management Science,Accounting,8,0.0
4,300 Level,Sciences,Biochemistry (sciences),7,0.0
5,400 Level,Education,Art & social science education,4,0.0


In [144]:
#Define the preprocessor with OneHotEncoder for categorical columns

from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
# Define the columns that need to be encoded (categorical columns)
categorical_cols = ['Unilag', 'Academic_Level', 'Faculty',
                    'Department']

# Define the preprocessor with OneHotEncoder for categorical columns and SimpleImputer for missing values

encoder = OneHotEncoder(handle_unknown='ignore')


preprocessor = ColumnTransformer(
    transformers=[
        ('cat', encoder, categorical_cols),
    ],
    remainder='passthrough'  # Place remainder parameter outside the transformers list
)


In [145]:
# Step 4: Fit and transform the preprocessor on the test data
X_test_processed = preprocessor.fit_transform(X_test)

# Step 5: Train a model on the test data
model = LinearRegression()
model.fit(X_test_processed, y_test)

# Step 6: Use the trained model to predict the missing values in the training data
X_train = training_data.drop(columns=['Course_Unit'])
X_train_processed = preprocessor.transform(X_train)
predicted_values = model.predict(X_train_processed)

# Step 7: Assign the predicted values to the training data
training_data['Course_Unit'] = np.ceil(predicted_values)

ValueError: A given column is not a column of the dataframe

In [None]:
training_data

Unnamed: 0,Unilag,Academic_Level,Faculty,Department,Courses_Taken,Course_Unit
0,Yes,400 Level,Social Sciences,Political science,7,12.0
1,Yes,500 Level,Engineering,Surveying & geo-informatics engineering,10,22.0
2,Yes,400 Level,Engineering,Chemical engineering,10,23.0
3,Yes,300 Level,Education,Educational foundations,10,17.0
4,Yes,300 Level,Social Sciences,Mass communication,9,23.0
...,...,...,...,...,...,...
61,Yes,200 Level,Education,Educational foundations,9,15.0
62,Yes,200 Level,Management Science,Accounting,8,19.0
63,Yes,200 Level,Arts,Philosophy,9,19.0
64,Yes,400 Level,Education,Art & social science education,4,12.0


In [None]:
#create a dataframe containing values from the missing_course_units.df that were not used as a training data set

remaining_data = missing_course_units[~missing_course_units.index.isin(training_data.index)]

In [None]:
# Concatenate training_data and remaining_data to get the complete dataset
df_course_units = pd.concat([training_data, remaining_data])

len(df_course_units)

299

In [None]:
#We still have NaN values in our column
#Drawing a df out that holds any missing values in a column


df_course_units[pd.isna(df_course_units['Course_Unit'])]

Unnamed: 0,Unilag,Academic_Level,Faculty,Department,Courses_Taken,Course_Unit


In [None]:
len(df_course_units)

299

##### Well?

In [None]:
removed_columns = columns_to_drop

# Add the removed columns back to the new DataFrame 'new_df' in place
new_df_unilag= pd.concat([df_course_units, df_unilag[removed_columns]], axis=1)

# Now 'new_df' will have the removed columns added back in place

In [None]:
new_df_unilag.head(10)

Unnamed: 0,Unilag,Academic_Level,Faculty,Department,Courses_Taken,Course_Unit,Strike_Effect,Challenging_Part,Job_Undertaken,Self_Development,Exam_Prep_Before_Strike,Exam_Prep_After_Strike,Lectures_Affected,Academic_Activities,Relationship_Status,Age,Gender,CGPA_Before,CGPA_After,CGPA Change
0,Yes,400 Level,Social Sciences,Political science,7,12.0,I learned how to study better and my grades al...,Trying to remember things we were taught befor...,Worked in a role relevant to my studies,Acquired skills unrelated to course of study,Poorly,Poorly,No noticeable change,Rarely: I engaged in academic activities once ...,Single,22,Male,3.39,3.51,0.12
1,Yes,500 Level,Engineering,Surveying & geo-informatics engineering,10,22.0,It affected it in a negative way as it became ...,"Rekindling the student in me, lol. Trying to g...",Did not work during the strike,Acquired skills unrelated to course of study,Poorly,Moderately,No noticeable change,Rarely: I engaged in academic activities once ...,Single,23,Female,4.44,4.5,0.06
2,Yes,400 Level,Engineering,Chemical engineering,10,23.0,It has actually helped me a bit. The extended ...,Readapting to school,Worked in a role unrelated to my studies,"Volunteered for an event or organization, Acqu...",Moderately,Moderately,Fewer lecturers attended classes,Rarely: I engaged in academic activities once ...,Dating,21,Male,3.54,3.61,0.07
3,Yes,300 Level,Education,Educational foundations,10,17.0,Good,Reading,Worked in a role unrelated to my studies,Acquired skills unrelated to course of study,Moderately,Very,No noticeable change,Rarely: I engaged in academic activities once ...,Dating,29,Male,3.86,3.96,0.1
4,Yes,300 Level,Social Sciences,Mass communication,9,23.0,Reluctance to concentrate on my studies,Getting to dust my books and and assimilate,Did not work during the strike,None of the above,Poorly,Very,Worse lectures after the strike,Never: I did not engage in any academic activi...,Single,24,Male,2.85,2.0,-0.85
5,Yes,400 Level,Management Science,Finance,6,17.0,No effect,No challenge,Worked in a role unrelated to my studies,Acquired skills unrelated to course of study,Moderately,Moderately,No noticeable change,Rarely: I engaged in academic activities once ...,Single,20,Female,4.0,3.8,-0.2
6,Yes,300 Level,Sciences,Botany,6,15.0,My grades dropped by a large margin,I couldn't face my books squarely,Worked in a role unrelated to my studies,Volunteered for an event or organization,Very,Poorly,Worse lectures after the strike,Never: I did not engage in any academic activi...,Single,22,Female,3.66,3.5,-0.16
7,Yes,300 Level,Education,Science tech. education,7,18.0,"The strike helped me with more time to study, ...",Finding my books and getting back into study m...,Worked in a role relevant to my studies,"Acquired skills relevant to course of study, A...",Moderately,Moderately,No noticeable change,Rarely: I engaged in academic activities once ...,Single,21,Female,3.06,3.26,0.2
8,Yes,200 Level,Management Science,Finance,7,19.0,Poorly,Going back to assignments and exams,Worked in a role unrelated to my studies,Vocational training and artisanship,Moderately,Poorly,Fewer lecturers attended classes,Rarely: I engaged in academic activities once ...,Dating,23,Female,3.5,3.49,-0.01
9,Yes,300 Level,Education,Educational foundations,8,14.0,Still the same,Prolonged graduating date,Worked in a role unrelated to my studies,Acquired skills unrelated to course of study,Moderately,Moderately,Worse lectures after the strike,Often: I engaged in academic activities regula...,Single,25,Male,3.42,3.72,0.3


In [None]:
# Define the desired column order
desired_order = ['Unilag', 'Academic_Level', 'Age', 'Gender', 'Relationship_Status', 'Faculty',
                 'Department', 'Strike_Effect', 'Challenging_Part', 'Job_Undertaken', 'Self_Development',
                 'Exam_Prep_Before_Strike', 'Exam_Prep_After_Strike', 'Lectures_Affected',
                 'Academic_Activities', 'Courses_Taken', 'Course_Unit', 'cgpa_before', 'CGPA_After', 'CGPA Change']

# Rearrange the columns of 'new_df' in the desired order
new_df_unilag = new_df_unilag.reindex(columns=desired_order)

In [None]:
new_df_unilag.head(2)

Unnamed: 0,Unilag,Academic_Level,Age,Gender,Relationship_Status,Faculty,Department,Strike_Effect,Challenging_Part,Job_Undertaken,Self_Development,Exam_Prep_Before_Strike,Exam_Prep_After_Strike,Lectures_Affected,Academic_Activities,Courses_Taken,Course_Unit,CGPA_Before,CGPA_After,CGPA Change
0,Yes,400 Level,22,Male,Single,Social Sciences,Political science,I learned how to study better and my grades al...,Trying to remember things we were taught befor...,Worked in a role relevant to my studies,Acquired skills unrelated to course of study,Poorly,Poorly,No noticeable change,Rarely: I engaged in academic activities once ...,7,12.0,3.39,3.51,0.12
1,Yes,500 Level,23,Female,Single,Engineering,Surveying & geo-informatics engineering,It affected it in a negative way as it became ...,"Rekindling the student in me, lol. Trying to g...",Did not work during the strike,Acquired skills unrelated to course of study,Poorly,Moderately,No noticeable change,Rarely: I engaged in academic activities once ...,10,22.0,4.44,4.5,0.06
