# Project #1, Part 2: Data Exploration and Cleaning

# Yash Patel

# 2/6/24

### This notebook takes an input file ('Majors Survey Results - Fall 2021.csv') and cleans the data to create a structured, condensed CSV file with actionable data to answer the following questions.

### 1. Which events had students attended prior to coming to CCM?

### 2. Which were the greatest motivations to seek a computing degree/certificate at CCM (to get a job, transfer to bachelor's level program, etc.)?

### 3. What were the main sources from which they received information about CCM’s computing programs?

### 4. Which degree programs are the most popular (computer science, information technology, etc.)?

In [1]:
import pandas as pd

data = pd.read_csv('Majors Survey Results - Fall 2021.csv', encoding='utf-8')

## The cell above loads the input file into a pandas DataFrame. The cells below do some preliminary data exploration using these functions and attributes: describe(), info(), columns, shape, dtypes, head(), tail(), and sample().

In [2]:
data.describe()

Unnamed: 0,"On a scale of 1 to 5, with 1 being not at all interested and 5 being extremely interested, how interested are you in taking more computing classes?"
count,68.0
mean,3.647059
std,1.003943
min,1.0
25%,3.0
50%,4.0
75%,4.0
max,5.0
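
By default, describe() summarizes only numeric columns, which is why a single column appears above even though the survey answers are mostly text. The sketch below uses a small, hypothetical DataFrame (`toy`, not the real survey data) to show how the non-numeric columns can be summarized as well.

```python
import pandas as pd

# Hypothetical toy data standing in for the survey; not the real responses.
toy = pd.DataFrame({
    "interest": [3.0, 4.0, None, 5.0],
    "Gender": ["Man", "Woman", "Man", "Man"],
})

# Default behaviour: only numeric columns are summarized, and missing
# values are left out of the count.
numeric_summary = toy.describe()

# Excluding numeric columns summarizes the text columns instead
# (count, unique, top, freq).
text_summary = toy.describe(exclude="number")

print(numeric_summary)
print(text_summary)
```

The `count` row is worth reading closely: it reports non-missing answers only, so a count well below the number of rows means many respondents skipped that question.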


In [3]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 245 entries, 0 to 244
Data columns (total 90 columns):
 #   Column                                                                                                                                                                                                                                                           Non-Null Count  Dtype  
---  ------                                                                                                                                                                                                                                                           --------------  -----  
 0   Timestamp                                                                                                                                                                                                                                                        245 non-null    object 
 1   Which course are you enrolled in

In [4]:
data.columns

Index(['Timestamp', 'Which course are you enrolled in?',
       'How did you hear about County College of Morris? [CCM Web site]',
       'How did you hear about County College of Morris? [Social Media]',
       'How did you hear about County College of Morris? [Community Event]',
       'How did you hear about County College of Morris? [Family member or friend]',
       'How did you hear about County College of Morris? [Current CCM student]',
       'How did you hear about County College of Morris? [CCM Alumni]',
       'How did you hear about County College of Morris? [High School Teacher]',
       'How did you hear about County College of Morris? [High School Counselor]',
       'How did you hear about County College of Morris? [In-app advertisement]',
       'How did you hear about County College of Morris? [Employer]',
       'How did you hear about County College of Morris? [Billboard]',
       'How did you hear about County College of Morris? [Television]',
       'How did you h

In [5]:
data.tail()

Unnamed: 0,Timestamp,Which course are you enrolled in?,How did you hear about County College of Morris? [CCM Web site],How did you hear about County College of Morris? [Social Media],How did you hear about County College of Morris? [Community Event],How did you hear about County College of Morris? [Family member or friend],How did you hear about County College of Morris? [Current CCM student],How did you hear about County College of Morris? [CCM Alumni],How did you hear about County College of Morris? [High School Teacher],How did you hear about County College of Morris? [High School Counselor],...,Did you receive information about the CCM computing programs from any of the following sources? [Employer],Did you receive information about the CCM computing programs from any of the following sources? [CCM Workforce Development],Did you receive information about the CCM computing programs from any of the following sources? [NJ Workforce Development Program],Did you receive information about the CCM computing programs from any of the following sources? [Other],"Was a computing major/certificate your first choice, or did you change majors from a different CCM program? If you changed majors, indicate what your first major was.","On a scale of 1 to 5, with 1 being not at all interested and 5 being extremely interested, how interested are you in taking more computing classes?",Please explain your answer to the question above. Why or why not would you be interested in taking another computing class?,Gender,Race/ethnicity,Age
240,2021/10/22 5:07:51 PM EST,CMP 130 Intro to IT,Don't recall,Don't recall,Don't recall,Don't recall,Don't recall,Don't recall,Don't recall,Don't recall,...,No,No,No,No,First Choice,,,Man,White/Caucasian,35-64
241,2021/10/22 7:34:04 PM EST,CMP 131 Fundamentals of Programming (Python),No,No,Yes,Yes,Yes,Yes,Yes,No,...,No,No,No,Yes,First Choice,,,Man,White/Caucasian,18 and younger
242,2021/10/22 7:36:54 PM EST,CMP 131 Fundamentals of Programming (Python),Yes,Yes,No,Yes,No,Yes,No,No,...,No,No,No,Yes,First Choice,,,Woman,White/Caucasian,18 and younger
243,2021/12/17 10:13:34 AM EST,CMP 130 Intro to IT,Don't recall,Don't recall,Don't recall,Don't recall,Don't recall,Don't recall,Don't recall,Don't recall,...,No,No,No,No,First Choice,,,Man,White/Caucasian,35-64
244,2021/12/17 1:11:40 PM EST,CMP 128 Computer Science I,No,No,No,Don't recall,Don't recall,Don't recall,Yes,Don't recall,...,No,No,No,No,Business,,,Man,White/Caucasian,18 and younger


In [6]:
data.sample()

Unnamed: 0,Timestamp,Which course are you enrolled in?,How did you hear about County College of Morris? [CCM Web site],How did you hear about County College of Morris? [Social Media],How did you hear about County College of Morris? [Community Event],How did you hear about County College of Morris? [Family member or friend],How did you hear about County College of Morris? [Current CCM student],How did you hear about County College of Morris? [CCM Alumni],How did you hear about County College of Morris? [High School Teacher],How did you hear about County College of Morris? [High School Counselor],...,Did you receive information about the CCM computing programs from any of the following sources? [Employer],Did you receive information about the CCM computing programs from any of the following sources? [CCM Workforce Development],Did you receive information about the CCM computing programs from any of the following sources? [NJ Workforce Development Program],Did you receive information about the CCM computing programs from any of the following sources? [Other],"Was a computing major/certificate your first choice, or did you change majors from a different CCM program? If you changed majors, indicate what your first major was.","On a scale of 1 to 5, with 1 being not at all interested and 5 being extremely interested, how interested are you in taking more computing classes?",Please explain your answer to the question above. Why or why not would you be interested in taking another computing class?,Gender,Race/ethnicity,Age
7,2021/10/04 12:20:47 PM EST,CMP 128 Computer Science I,Yes,No,No,Yes,Yes,No,Yes,Yes,...,,,,,,3.0,I am not sure if computer science is a field t...,Woman,Asian,18 and younger
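
The shape, dtypes, and head() checks mentioned earlier follow the same pattern as the cells above. This is a minimal sketch on a hypothetical three-row DataFrame (`toy`), not output from the real survey file.

```python
import pandas as pd

# Hypothetical toy data; the column names are borrowed from the survey for illustration only.
toy = pd.DataFrame({
    "Timestamp": ["2021/10/04 12:20:47 PM EST",
                  "2021/10/22 5:07:51 PM EST",
                  "2021/12/17 1:11:40 PM EST"],
    "interest": [3.0, None, None],
    "Age": ["18 and younger", "35-64", "18 and younger"],
})

# shape is an attribute (no parentheses): (number of rows, number of columns)
n_rows, n_cols = toy.shape
print(n_rows, n_cols)

# dtypes is also an attribute: one data type per column
print(toy.dtypes)

# head(n) returns the first n rows (5 if n is omitted)
first_two = toy.head(2)
print(first_two)
```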


## The cells below will now perform data cleaning operations on the dataset using the drop() and rename() functions.

In [7]:
#Remove columns that are not relevant or redundant.
#Whole question groups are matched by their shared prefix; the remaining columns are listed by name.
drop_prefixes = ('How did you hear about County College of Morris?',
                 'To what extent did the following impact your decision to attend County College of Morris?',
                 'To what extent did the following activities or experience impact your decision to enroll in an computing course at CCM?')
drop_names = ['Timestamp',
              'Which course are you enrolled in?',
              'Was a computing major/certificate your first choice, or did you change majors from a different CCM program? If you changed majors, indicate what your first major was.',
              'On a scale of 1 to 5, with 1 being not at all interested and 5 being extremely interested, how interested are you in taking more computing classes?',
              'Please explain your answer to the question above.  Why or why not would you be interested in taking another computing class?',
              'Gender', 'Race/ethnicity', 'Age']
df1 = data.drop(columns = drop_names + [c for c in data.columns if c.startswith(drop_prefixes)])
#Test if we only see the columns we need - should have 38 columns
df1.sample()

Unnamed: 0,"Prior to applying to college, did you participate in any of the following events or activities at the County College of Morris and/or with the Department of Information Technologies, if at all? [Open House]","Prior to applying to college, did you participate in any of the following events or activities at the County College of Morris and/or with the Department of Information Technologies, if at all? [Instant Decision Day]","Prior to applying to college, did you participate in any of the following events or activities at the County College of Morris and/or with the Department of Information Technologies, if at all? [On-Campus Information Session]","Prior to applying to college, did you participate in any of the following events or activities at the County College of Morris and/or with the Department of Information Technologies, if at all? [Titans Tuesday (Virtual) Information Session]","Prior to applying to college, did you participate in any of the following events or activities at the County College of Morris and/or with the Department of Information Technologies, if at all? [Women Who Dare]","Prior to applying to college, did you participate in any of the following events or activities at the County College of Morris and/or with the Department of Information Technologies, if at all? [Regional College Fair]","Prior to applying to college, did you participate in any of the following events or activities at the County College of Morris and/or with the Department of Information Technologies, if at all? [HS Sharetime Information Session]","Prior to applying to college, did you participate in any of the following events or activities at the County College of Morris and/or with the Department of Information Technologies, if at all? [Challenger Program]","Prior to applying to college, did you participate in any of the following events or activities at the County College of Morris and/or with the Department of Information Technologies, if at all? 
[CyberSecurity Information Protection Program Participation ]","Prior to applying to college, did you participate in any of the following events or activities at the County College of Morris and/or with the Department of Information Technologies, if at all? [Information Session at my high school]",...,What motivated you to seek a computing degree/certificate at CCM? [Personal Enrichment],Did you receive information about the CCM computing programs from any of the following sources? [High school guidance counselor],Did you receive information about the CCM computing programs from any of the following sources? [High School Teacher],Did you receive information about the CCM computing programs from any of the following sources? [CCM Information Technologies Website],Did you receive information about the CCM computing programs from any of the following sources? [CCM Admissions],Did you receive information about the CCM computing programs from any of the following sources? [CCM advisor/counselor],Did you receive information about the CCM computing programs from any of the following sources? [Employer],Did you receive information about the CCM computing programs from any of the following sources? [CCM Workforce Development],Did you receive information about the CCM computing programs from any of the following sources? [NJ Workforce Development Program],Did you receive information about the CCM computing programs from any of the following sources? [Other]
52,Yes,No,Yes,No,No,No,No,No,No,No,...,Yes,No,No,Yes,No,Yes,No,No,No,Yes
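
A single sampled row shows which columns remain but does not by itself confirm how many there are. The sketch below shows one way to make that check explicit, using a hypothetical four-column DataFrame (`toy`) in place of the survey; for the real data the expected count would be 38.

```python
import pandas as pd

# Hypothetical toy data with two columns to drop and two to keep.
toy = pd.DataFrame({
    "Timestamp": ["2021/10/04 12:20:47 PM EST", "2021/10/22 5:07:51 PM EST"],
    "Gender": ["Woman", "Man"],
    "event [Open House]": ["Yes", "No"],
    "source [Employer]": ["No", "No"],
})

unwanted = ["Timestamp", "Gender"]
trimmed = toy.drop(columns=unwanted)

# Fail loudly if the column count is not what we expect.
expected_columns = 2  # 38 for the real survey, per the comment in the cell above
assert trimmed.shape[1] == expected_columns

# None of the dropped names should survive.
leftover = set(unwanted) & set(trimmed.columns)
print(trimmed.shape, leftover)
```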


In [10]:
#Rename the columns to short snake_case names (acronyms such as CCM, IT, and NJ keep their capitals).
#Each survey question is stored once as a prefix; every key is the prefix plus the bracketed option.
prior = 'Prior to applying to college, did you participate in any of the following events or activities at the County College of Morris and/or with the Department of Information Technologies, if at all? '
motivation = 'What motivated you to seek a computing degree/certificate at CCM?  '
info = 'Did you receive information about the CCM computing programs from any of the following sources? '
df2 = df1.rename(columns = {
    prior + '[Open House]\t':'prior_participation_open_house',
    prior + '[Open House]':'prior_participation_open_house',
    prior + '[Instant Decision Day]':'prior_participation_instant_decision_day',
    prior + '[On-Campus Information Session]':'prior_participation_on_campus_info_session',
    prior + '[Titans Tuesday (Virtual) Information Session]':'prior_participation_virtual_info_session',
    prior + '[Women Who Dare]':'prior_participation_women_who_dare',
    prior + '[Regional College Fair]':'prior_participation_regional_college_fair',
    prior + '[HS Sharetime Information Session]':'prior_participation_hs_sharetime_info_session',
    prior + '[Challenger Program]':'prior_participation_challenger_program',
    prior + '[CyberSecurity Information Protection Program Participation ]':'prior_participation_cybersec_program',
    prior + '[Information Session at my high school]':'prior_participation_hs_info_session',
    prior + '[Campus Visit with my high school]':'prior_participation_hs_campus_visit',
    prior + '[Campus Visit (individual)]':'prior_participation_individual_campus_visit',
    prior + '[Workforce Development class]':'prior_participation_workforce_development_class',
    prior + '[Corporate Training]':'prior_participation_corporate_training',
    prior + '[Teen Arts Festival]':'prior_participation_teen_arts_fest',
    prior + '[Summer camp at CCM]':'prior_participation_CCM_summer_camp',
    'What degree program are you currently enrolled in?':'current_degree',
    motivation + '[To get a job in the computing field]':'motivation_computing_job',
    motivation + "[Transfer to bachelor's level program]":'motivation_transfer_bachelors',
    motivation + '[Transfer credits back to HS degree (ShareTime, Challenger Program)]':'motivation_hs_dual_enrollment',
    motivation + '[Career Advancement]':'motivation_career_advancement',
    motivation + '[Career Change]':'motivation_career_change',
    motivation + '[Professional Development]':'motivation_profession_development',
    motivation + '[Job Displacement]':'motivation_job_displacement',
    motivation + '[Relocation]':'motivation_relocation',
    motivation + '[To keep current in tech industry]':'motivation_keep_current_in_tech',
    motivation + '[IT Industry Certifications]':'motivation_IT_certification',
    motivation + '[Financial]':'motivation_financial',
    motivation + '[Personal Enrichment]':'motivation_personal_enrichment',
    info + '[High school guidance counselor]':'info_source_hs_guidance_counselor',
    info + '[High School Teacher]':'info_source_hs_teacher',
    info + '[CCM Information Technologies Website]':'info_source_ccm_website',
    info + '[CCM Admissions]':'info_source_ccm_admissions',
    info + '[CCM advisor/counselor]':'info_source_ccm_advisor',
    info + '[Employer]':'info_source_employer',
    info + '[CCM Workforce Development]':'info_source_workforce_development',
    info + '[NJ Workforce Development Program]':'info_source_NJ_workforce_development',
    info + '[Other]':'info_source_other'})
#Test
df2.head()

Unnamed: 0,prior_participation_open_house,prior_participation_instant_decision_day,prior_participation_on_campus_info_session,prior_participation_virtual_info_session,prior_participation_women_who_dare,prior_participation_regional_college_fair,prior_participation_hs_sharetime_info_session,prior_participation_challenger_program,prior_participation_cybersec_program,prior_participation_hs_info_session,...,motivation_personal_enrichment,info_source_hs_guidance_counselor,info_source_hs_teacher,info_source_ccm_website,info_source_ccm_admissions,info_source_ccm_advisor,info_source_employer,info_source_workforce_development,info_source_NJ_workforce_development,info_source_other
0,No,No,No,No,No,No,No,No,No,No,...,,,,,,,,,,
1,No,No,No,No,No,No,No,No,No,No,...,,,,,,,,,,
2,No,No,No,No,No,No,No,No,No,No,...,No,No,Yes,No,No,No,No,No,No,Yes
3,No,No,No,No,No,No,No,No,No,No,...,Yes,No,No,No,No,No,No,No,No,No
4,Yes,No,Yes,No,No,Yes,No,No,No,Yes,...,,,,,,,,,,
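
The head() output above has blank cells in the motivation and info_source columns for some respondents, so it may be useful to measure how much is missing before relying on those columns. This is a minimal sketch on hypothetical toy data (`toy`); the counts it prints say nothing about the real survey.

```python
import pandas as pd

# Hypothetical toy data reusing three of the renamed column names.
toy = pd.DataFrame({
    "prior_participation_open_house": ["No", "No", "No", "Yes"],
    "motivation_personal_enrichment": [None, "No", "Yes", None],
    "info_source_other": [None, "Yes", "No", None],
})

# Number of missing answers in each column.
missing_per_column = toy.isna().sum()

# Number of respondents with at least one missing answer.
rows_with_gaps = int(toy.isna().any(axis=1).sum())

print(missing_per_column)
print(rows_with_gaps)
```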


In [9]:
#The to_csv function will allow me to save the cleaned csv file locally. 
df2.to_csv("/Users/ytpatel3/Downloads/cleaned_data2.csv")
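
By default, to_csv() also writes the row index as an unnamed first column, which shows up as 'Unnamed: 0' when the file is read back. If that extra column is not wanted, passing index=False leaves it out. The sketch below demonstrates the difference with a hypothetical one-column DataFrame (`toy`) and in-memory buffers instead of a file on disk.

```python
import io
import pandas as pd

# Hypothetical toy data; written to in-memory buffers rather than to disk.
toy = pd.DataFrame({"current_degree": ["Computer Science", "Information Technology"]})

with_index = io.StringIO()
toy.to_csv(with_index)                   # default: the index is written too

without_index = io.StringIO()
toy.to_csv(without_index, index=False)   # index left out

# Rewind both buffers and read them back to compare the columns.
with_index.seek(0)
without_index.seek(0)
cols_with = list(pd.read_csv(with_index).columns)
cols_without = list(pd.read_csv(without_index).columns)

print(cols_with)
print(cols_without)
```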