# Dataframes with Pandas - Students Health and Academic Performance

This dataset explores the relationship between students' health and their academic performance. Each row represents one student, and each column records a survey response such as age group, gender, mobile phone usage, reported symptoms, and self-rated health. It can be used to analyze the impact of health on academic success, identify potential predictors of academic performance, and inform interventions that support students' overall well-being and academic achievement.

In [2]:
import pandas as pd
import numpy as np

In [3]:
android = pd.read_csv('/Users/timothypark/Documents/portfolios/timpark99.github.io/Dataframes in Pandas/Android.csv')

# inspect the first 5 rows
android.head()

Unnamed: 0,Names,Age,Gender,Mobile Phone,Mobile Operating System,Mobile phone use for education,Mobile phone activities,Helpful for studying,Educational Apps,Daily usages,Performance impact,Usage distraction,Attention span,Useful features,Health Risks,Beneficial subject,Usage symptoms,Symptom frequency,Health precautions,Health rating
0,Ali,21-25,Male,Yes,Android,Sometimes,Social Media,Yes,Educational Videos,4-6 hours,Agree,During Exams,Yes,Camera,Yes,Accounting,Headache,Never,Using Blue light filter,Excellent
1,Bilal,21-25,Male,Yes,Android,Sometimes,Social Media,Yes,Educational Videos,4-6 hours,Neutral,During Exams,Yes,Notes Taking App,Yes,Browsing Material,All of these,Sometimes,Taking Break during prolonged use,Good
2,Abdullah,21-25,Male,Yes,Android,Frequently,All of these,Yes,Educational Videos,2-4 hours,Strongly agree,During Class Lectures,No,Internet Access,Only Partially,Reasarch,,Never,Limiting Screen Time,Excellent
3,Aammar,21-25,Male,Yes,Android,Rarely,All of these,Yes,Educational Videos,> 6 hours,Neutral,Not Distracting,Yes,Internet Access,Only Partially,Reasarch,Headache,Sometimes,None of Above,Good
4,Jehanzaib,21-25,Male,Yes,Android,Rarely,All of these,Yes,Educational Videos,2-4 hours,Strongly disagree,While Studying,Yes,Camera,Only Partially,Reasarch,Headache,Frequently,None of Above,Excellent


In [4]:
# info() reports the number of rows and columns, the column names, each column's dtype
# and non-null count, and the amount of memory the DataFrame uses
android.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 81 entries, 0 to 80
Data columns (total 20 columns):
 #   Column                          Non-Null Count  Dtype 
---  ------                          --------------  ----- 
 0   Names                           81 non-null     object
 1   Age                             81 non-null     object
 2   Gender                          81 non-null     object
 3   Mobile Phone                    81 non-null     object
 4   Mobile Operating System         81 non-null     object
 5   Mobile phone use for education  79 non-null     object
 6   Mobile phone activities         80 non-null     object
 7   Helpful for studying            79 non-null     object
 8   Educational Apps                79 non-null     object
 9   Daily usages                    80 non-null     object
 10  Performance impact              79 non-null     object
 11  Usage distraction               79 non-null     object
 12  Attention span                  80 non-null     object
 ...
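
The non-null counts above imply that a few answers are missing in some columns. A minimal sketch of counting them directly, assuming a small hypothetical DataFrame `df` in place of the real survey data:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the survey data; the values are illustrative only.
df = pd.DataFrame({
    "Names": ["A", "B", "C", "D"],
    "Daily usages": ["4-6 hours", np.nan, "2-4 hours", "> 6 hours"],
    "Health rating": ["Good", "Excellent", np.nan, np.nan],
})

# Count missing values per column and list the worst columns first.
missing = df.isna().sum().sort_values(ascending=False)
print(missing)
```

Running the same two method calls on `android` should give the complement of the non-null counts that `info()` prints.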

In [5]:
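# summary statistics: every column holds text, so describe() reports
# count, unique, top (most frequent value), and freq (how often top occurs) instead of numeric statistics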
android.describe()

Unnamed: 0,Names,Age,Gender,Mobile Phone,Mobile Operating System,Mobile phone use for education,Mobile phone activities,Helpful for studying,Educational Apps,Daily usages,Performance impact,Usage distraction,Attention span,Useful features,Health Risks,Beneficial subject,Usage symptoms,Symptom frequency,Health precautions,Health rating
count,81,81,81,81,81,79,80,79,79,80,79,79,80,79,79,80,79,80,80,80
unique,81,4,2,1,1,4,7,2,4,4,5,4,2,4,3,3,6,4,4,7
top,Ali,21-25,Male,Yes,Android,Sometimes,All of these,Yes,Educational Videos,4-6 hours,Agree,While Studying,Yes,Internet Access,Yes,Reasarch,All of these,Sometimes,Limiting Screen Time,Good
freq,1,59,64,81,81,41,47,75,43,32,32,26,55,55,49,39,26,42,30,35
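
To make the four rows of this summary concrete, here is a small sketch on a hypothetical one-column DataFrame (the answers are made up; only the column name is borrowed from the survey):

```python
import pandas as pd

# Hypothetical answers, including one missing value.
answers = pd.DataFrame({"Helpful for studying": ["Yes", "Yes", "No", None]})

summary = answers.describe()
col = summary["Helpful for studying"]

# count ignores the missing answer; unique counts distinct non-missing values;
# top is the most frequent value and freq is how many times it appears.
print(col)
```

Here `count` is 3 rather than 4 because the missing answer is excluded, which is also why some columns above report 79 or 80 instead of 81.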


In [6]:
# check how many rows there are for each symptom
android['Usage symptoms'].value_counts()

Usage symptoms
All of these                                                 26
Sleep disturbance                                            22
Headache                                                     20
Anxiety or Stress                                             8
Sleep disturbance;Anxiety or Stress                           2
Headache;Sleep disturbance;Anxiety or Stress;All of these     1
Name: count, dtype: int64
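
Some respondents selected several symptoms, which are stored as one `;`-separated string and therefore counted as their own category. One possible way to count each symptom individually is to split and explode the answers. This is a sketch on a hypothetical Series, not part of the original analysis:

```python
import pandas as pd

# Hypothetical answers in the same ';'-separated style as the survey column.
symptoms = pd.Series(
    ["Headache", "Sleep disturbance;Anxiety or Stress", "Headache", None],
    name="Usage symptoms",
)

# Split each answer on ';', give every choice its own row, then count.
individual_counts = (
    symptoms.dropna()
    .str.split(";")
    .explode()
    .str.strip()
    .value_counts()
)
print(individual_counts)
```

Whether "All of these" should then be expanded into the individual symptoms is a judgment call that depends on how the survey defined that option.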

In [7]:
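# sort by age group from oldest to youngest
# Age holds text ranges such as '21-25', so the sort is alphabetical; descending order puts '31-35' first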
age_sorted = android.sort_values(by='Age', ascending=False)
age_sorted.head()

Unnamed: 0,Names,Age,Gender,Mobile Phone,Mobile Operating System,Mobile phone use for education,Mobile phone activities,Helpful for studying,Educational Apps,Daily usages,Performance impact,Usage distraction,Attention span,Useful features,Health Risks,Beneficial subject,Usage symptoms,Symptom frequency,Health precautions,Health rating
59,Jamshid,31-35,Male,Yes,Android,Rarely,Web-browsing,Yes,Language,2-4 hours,,During Exams,Yes,Notes Taking App,No,Accounting,Sleep disturbance,Rarely,Taking Break during prolonged use,Fair
72,Sayed,31-35,Male,Yes,Android,Rarely,Messaging,,Productivity Tools,> 6 hours,Strongly disagree,Not Distracting,Yes,Notes Taking App,No,Accounting,All of these,Frequently,Using Blue light filter,Fair
27,Khawir,31-35,Male,Yes,Android,Never,All of these,No,Study Planner,< 2 hours,Neutral,Not Distracting,No,Calculator,Only Partially,Accounting,All of these,Never,None of Above,Excellent
58,Jawed,26-30,Male,Yes,Android,Rarely,Web-browsing,Yes,Language,> 6 hours,Agree,During Exams,No,Internet Access,Yes,Reasarch,Anxiety or Stress,Rarely,Limiting Screen Time,Good
45,Sabir,26-30,Male,Yes,Android,Sometimes,All of these,Yes,Productivity Tools,4-6 hours,Agree,While Studying,Yes,Internet Access,Only Partially,Browsing Material,Sleep disturbance,Sometimes,Taking Break during prolonged use,Good
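
Because `Age` is stored as text, the sort above is alphabetical and only matches the true age order when every label starts with comparable digits. A hedged alternative is an ordered categorical. The labels and their order below are assumptions for illustration; the real column may use different ones:

```python
import pandas as pd

# Hypothetical age groups; "Under 20" is an assumed label, not taken from the data.
df = pd.DataFrame({
    "Names": ["A", "B", "C", "D"],
    "Age": ["21-25", "31-35", "Under 20", "26-30"],
})

# Plain text sort: "Under 20" lands first because 'U' sorts after digits.
text_sorted = df.sort_values(by="Age", ascending=False)

# Declare the intended order explicitly, then sort on it.
age_order = ["Under 20", "21-25", "26-30", "31-35"]
df["Age"] = pd.Categorical(df["Age"], categories=age_order, ordered=True)
oldest_first = df.sort_values(by="Age", ascending=False)

print(text_sorted["Names"].tolist())
print(oldest_first["Names"].tolist())
```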


In [8]:
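# use iloc to select the two rows at integer positions 10 and 11 (the slice end, 12, is excluded)
# positions refer to the sorted order, not to the index labels shown in the first column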
age_sorted.iloc[10:12]

Unnamed: 0,Names,Age,Gender,Mobile Phone,Mobile Operating System,Mobile phone use for education,Mobile phone activities,Helpful for studying,Educational Apps,Daily usages,Performance impact,Usage distraction,Attention span,Useful features,Health Risks,Beneficial subject,Usage symptoms,Symptom frequency,Health precautions,Health rating
44,Kausar,21-25,Female,Yes,Android,Frequently,All of these,Yes,Productivity Tools,2-4 hours,Strongly agree,Not Distracting,No,Internet Access,Yes,Reasarch,All of these,Frequently,Limiting Screen Time,Good
56,Usman,21-25,Male,Yes,Android,Sometimes,Web-browsing,Yes,Educational Videos,2-4 hours,Neutral,During Class Lectures,Yes,Camera,Yes,Accounting,Anxiety or Stress,Rarely,Taking Break during prolonged use,Fair
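
The rows returned above carry index labels 44 and 56 even though they sit at positions 10 and 11, because sorting reorders rows without renumbering them. A small sketch of the difference between `iloc` (position) and `loc` (label), using a hypothetical three-row DataFrame:

```python
import pandas as pd

df = pd.DataFrame({"Names": ["A", "B", "C"], "Age": ["21-25", "31-35", "26-30"]})

# After sorting, the index labels appear in the order 1, 2, 0.
sorted_df = df.sort_values(by="Age", ascending=False)

by_position = sorted_df.iloc[0]  # first row in the sorted order
by_label = sorted_df.loc[0]      # row whose index label is 0

# reset_index(drop=True) renumbers the labels so positions and labels agree again.
renumbered = sorted_df.reset_index(drop=True)

print(by_position["Names"], by_label["Names"])
```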


In [9]:
# build a boolean mask that is True where "Health precautions" equals "Limiting Screen Time",
# then use it to keep only the matching rows
mask = age_sorted['Health precautions'] == 'Limiting Screen Time'
limit_df = age_sorted[mask]
limit_df.head()

Unnamed: 0,Names,Age,Gender,Mobile Phone,Mobile Operating System,Mobile phone use for education,Mobile phone activities,Helpful for studying,Educational Apps,Daily usages,Performance impact,Usage distraction,Attention span,Useful features,Health Risks,Beneficial subject,Usage symptoms,Symptom frequency,Health precautions,Health rating
58,Jawed,26-30,Male,Yes,Android,Rarely,Web-browsing,Yes,Language,> 6 hours,Agree,During Exams,No,Internet Access,Yes,Reasarch,Anxiety or Stress,Rarely,Limiting Screen Time,Good
34,Ameer,26-30,Male,Yes,Android,Sometimes,Social Media,Yes,Educational Videos,4-6 hours,Agree,During Class Lectures,No,Notes Taking App,Yes,Accounting,All of these,Sometimes,Limiting Screen Time,Good
44,Kausar,21-25,Female,Yes,Android,Frequently,All of these,Yes,Productivity Tools,2-4 hours,Strongly agree,Not Distracting,No,Internet Access,Yes,Reasarch,All of these,Frequently,Limiting Screen Time,Good
52,Tufail,21-25,Male,Yes,Android,Never,Social Media,Yes,Language,2-4 hours,Neutral,During Class Lectures,No,Calculator,Yes,Reasarch,All of these,Never,Limiting Screen Time,Excellent;Good
47,Humara,21-25,Female,Yes,Android,Sometimes,All of these,Yes,Productivity Tools,2-4 hours,Strongly agree,While Studying,Yes,Camera,Only Partially,Browsing Material,Headache,Sometimes,Limiting Screen Time,Fair


In [10]:
# first number is the number of rows and the second is the number of columns
limit_df.shape

(30, 20)
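
Boolean masks can also be combined. The sketch below, on hypothetical data, uses `&` to require two conditions at once and `isin` to accept any of several values:

```python
import pandas as pd

# Hypothetical rows; only the column names are borrowed from the survey.
df = pd.DataFrame({
    "Health precautions": ["Limiting Screen Time", "None of Above", "Limiting Screen Time", None],
    "Health rating": ["Good", "Excellent", "Excellent", "Good"],
})

limits_screen_time = df["Health precautions"] == "Limiting Screen Time"
rated_excellent = df["Health rating"] == "Excellent"

# Rows that satisfy both conditions; missing values compare as False.
both = df.loc[limits_screen_time & rated_excellent]

# Rows whose rating is any of the listed values.
good_or_excellent = df.loc[df["Health rating"].isin(["Good", "Excellent"])]

print(both.shape, good_or_excellent.shape)
```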

In [11]:
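# find the most common usage symptom among rows whose health rating is exactly "Excellent"
# .loc selects rows and column in one step instead of chaining two indexing operations
excellent_mask = limit_df['Health rating'] == 'Excellent'
limit_df.loc[excellent_mask, 'Usage symptoms'].mode()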

0    Headache
Name: Usage symptoms, dtype: object
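
An equality test matches only the exact string, so a combined answer such as `Excellent;Good` (visible in the filtered rows earlier) is left out. If such rows should count as "Excellent" too, one option is a substring match. This is a sketch on a hypothetical Series, and whether to include combined answers is an analysis decision:

```python
import pandas as pd

# Hypothetical ratings, including a combined answer and a missing value.
ratings = pd.Series(["Excellent", "Good", "Excellent;Good", None], name="Health rating")

# Exact match: only the first entry qualifies.
exact = ratings == "Excellent"

# Substring match: also catches the combined answer; missing values count as False.
contains = ratings.str.contains("Excellent", na=False, regex=False)

print(exact.tolist())
print(contains.tolist())
```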

In [30]:
# a column name in the CSV file contained stray whitespace, which initially caused an error
# strip leading and trailing whitespace from every column name to fix it
android.columns = android.columns.str.strip()

# draw one random row from each gender group and show its health rating
android.groupby('Gender').sample()[['Health rating']]

Unnamed: 0,Health rating
19,Good
4,Excellent
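
Because `sample()` is random, rerunning the cell above generally returns different rows. Passing `random_state` makes the draw repeatable. A minimal sketch on hypothetical data:

```python
import pandas as pd

# Hypothetical rows; only the column names are borrowed from the survey.
df = pd.DataFrame({
    "Gender": ["Male", "Male", "Female", "Female", "Male"],
    "Health rating": ["Good", "Fair", "Excellent", "Good", "Excellent"],
})

# Same seed, same rows: one per gender group each time.
first = df.groupby("Gender").sample(n=1, random_state=0)
second = df.groupby("Gender").sample(n=1, random_state=0)

print(first[["Gender", "Health rating"]])
```

The seed value 0 is arbitrary; any fixed integer gives a reproducible sample.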


In [32]:
# load a second DataFrame (iOS users) that has the same columns, in the same order, as the Android one
ios = pd.read_csv('/Users/timothypark/Documents/portfolios/timpark99.github.io/Dataframes in Pandas/iOS.csv')
ios.head()

Unnamed: 0,Names,Age,Gender,Mobile Phone,Mobile Operating System,Mobile phone use for education,Mobile phone activities,Helpful for studying,Educational Apps,Daily usages,Performance impact,Usage distraction,Attention span,Useful features,Health Risks,Beneficial subject,Usage symptoms,Symptom frequency,Health precautions,Health rating
0,Hammad,21-25,Male,Yes,IOS,Sometimes,All of these,Yes,Educational Videos,4-6 hours,Strongly agree,Not Distracting,No,Camera,Yes,Browsing Material,All of these,Sometimes,None of Above,Excellent
1,Waqar,21-25,Male,Yes,IOS,Frequently,All of these,Yes,Educational Videos,> 6 hours,Agree,While Studying,Yes,Internet Access,No,Browsing Material,Sleep disturbance,Sometimes,None of Above,Excellent
2,Fatima,21-25,Female,Yes,IOS,Sometimes,All of these,Yes,Study Planner,4-6 hours,Agree,Not Distracting,Yes,Internet Access,No,Reasarch,Sleep disturbance,Sometimes,None of Above,Good
3,Wasid,21-25,Male,Yes,IOS,Sometimes,Social Media,Yes,Educational Videos,> 6 hours,Neutral,During Class Lectures,Yes,Internet Access,Yes,Accounting,Anxiety or Stress,Sometimes,Taking Break during prolonged use,Excellent
4,Mukhtar,21-25,Male,Yes,IOS,Sometimes,All of these,Yes,Language,4-6 hours,Neutral,While Studying,Yes,Internet Access,No,Browsing Material,Sleep disturbance;Anxiety or Stress,Rarely,None of Above,Good


In [34]:
# stack the iOS rows beneath the Android rows with concat (axis=0 concatenates along rows)
# and assign the result to a new combined DataFrame
# then verify that its length equals the sum of the lengths of the ios and android DataFrames

combined_df = pd.concat([android, ios], axis=0)
len(combined_df) == len(android) + len(ios)

True
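
One thing to be aware of: `pd.concat` keeps each input's original index, so the combined DataFrame contains duplicate labels (both files start at 0). If unique row labels are wanted, `ignore_index=True` renumbers them. A sketch with two tiny hypothetical frames that reuse the names `android` and `ios`:

```python
import pandas as pd

# Hypothetical miniature versions of the two DataFrames.
android = pd.DataFrame({"Names": ["A", "B"], "Mobile Operating System": ["Android", "Android"]})
ios = pd.DataFrame({"Names": ["C"], "Mobile Operating System": ["IOS"]})

# Default behaviour: original labels are kept, so label 0 appears twice.
stacked = pd.concat([android, ios], axis=0)

# ignore_index=True discards the old labels and numbers the rows 0..n-1.
renumbered = pd.concat([android, ios], axis=0, ignore_index=True)

print(stacked.index.tolist())
print(renumbered.index.tolist())
```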