### College Track - Data Analyst Skills Assessment

Here is the code that I used to analyze the data.  This is an organized, curated version of the Jupyter notebook I originally used to experiment with the analysis.

In [None]:
#Import packages and load excel into DataFrame
import pandas as pd
from operator import itemgetter
#Tried index_col=None so the first column (duplicate labels) would not be used as the index.  Found out this is a currently unresolved issue in pd.read_excel
df = pd.read_excel('Skill Assessment Data.xlsx')

#Print more rows to see entirety of DataFrame
pd.options.display.max_rows = 100

In [2]:
df.shape

(49, 69)

In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 49 entries, Student to Student
Data columns (total 69 columns):
My friends want me to succeed.                                                                                      49 non-null object
My best friends model responsible behavior.                                                                         49 non-null object
Other people pick on and bully me.                                                                                  49 non-null object
I am confident in new situations.                                                                                   49 non-null object
My friends treat me with respect and kindness                                                                       49 non-null object
I am good at making and keeping friends.                                                                            49 non-null object
I feel safe at school.                                                           

In [4]:
df.head(10)

Unnamed: 0,My friends want me to succeed.,My best friends model responsible behavior.,Other people pick on and bully me.,I am confident in new situations.,My friends treat me with respect and kindness,I am good at making and keeping friends.,I feel safe at school.,I am good at handling conflict.,People generally like me.,I can go to my family for support and advice.,...,Other people pick on me.,I get along better with adults than people my age.,I try to be nice to other people.,I usually share with others.,"I am helpful if someone is hurt, upset or feeling ill.",I often volunteer to help others.,"In the last month, how often have you felt that you were unable to control the important things in your life?","In the last month, how often have you felt confident about your ability to handle your personal problems?","In the last month, how often have you felt that things were going your way?","In the last month, how often have you felt difficulties were piling up so high that you could not overcome them?"
Student,yes,yes,no,yes,yes,yes,yes,yes,yes,yes,...,Not true,Not true,Certainly true,Somewhat true,Certainly true,Somewhat true,almost never,fairly often,fairly often,almost never
Student,yes,yes,no,yes,yes,yes,yes,yes,yes,yes,...,Not true,Somewhat true,Certainly true,Certainly true,Certainly true,Certainly true,almost never,fairly often,fairly often,almost never
Student,yes,yes,no,yes,yes,yes,yes,yes,yes,yes,...,Not true,Somewhat true,Certainly true,Somewhat true,Certainly true,Certainly true,never,often,fairly often,never
Student,yes,yes,no,yes,yes,yes,yes,yes,yes,yes,...,Not true,Somewhat true,Certainly true,Somewhat true,Certainly true,Somewhat true,never,often,sometimes,never
Student,yes,yes,yes,no,yes,yes,no,yes,yes,no,...,Somewhat true,Somewhat true,Certainly true,Certainly true,Certainly true,Certainly true,sometimes,almost never,never,fairly often
Student,yes,yes,no,no,yes,yes,yes,yes,yes,yes,...,Not true,Somewhat true,Certainly true,Somewhat true,Certainly true,Certainly true,never,often,sometimes,sometimes
Student,yes,yes,no,yes,yes,yes,yes,yes,yes,yes,...,Not true,Not true,Certainly true,Certainly true,Certainly true,Certainly true,sometimes,fairly often,fairly often,fairly often
Student,yes,yes,no,no,yes,yes,yes,yes,yes,"yes, no",...,Not true,Somewhat true,Certainly true,Somewhat true,Certainly true,Certainly true,almost never,fairly often,sometimes,almost never
Student,"yes, no",no,no,yes,yes,yes,yes,yes,yes,yes,...,Not true,Somewhat true,Somewhat true,Somewhat true,Somewhat true,Somewhat true,almost never,almost never,almost never,almost never
Student,yes,"yes, no",no,yes,yes,yes,yes,yes,yes,yes,...,Not true,Not true,Certainly true,Somewhat true,Somewhat true,Somewhat true,never,fairly often,fairly often,never


In [5]:
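#Search for all instances of 'yes, no' responses and replace them with 'unsure'
#na=False treats any missing value as "no comma" so the boolean mask never contains NaN
for col in df.columns:
    df.loc[df[col].str.contains(',', na=False), col] = 'unsure'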
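
To check what this step does, here is a minimal sketch on a small made-up frame. The names `toy`, `Q1` and `Q2` are hypothetical stand-ins, not the survey data, and `regex=False` and `na=False` are defensive choices of mine rather than something the data required.

```python
import pandas as pd

# Hypothetical toy data standing in for the survey responses
toy = pd.DataFrame({
    'Q1': ['yes', 'yes, no', 'no'],
    'Q2': ['no', 'yes', 'yes, no'],
})

# Any response containing a comma (e.g. 'yes, no') becomes 'unsure'
for col in toy.columns:
    mask = toy[col].str.contains(',', regex=False, na=False)
    toy.loc[mask, col] = 'unsure'

print(toy)
```

Responses without a comma are left untouched, so only the ambiguous double answers are recoded.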

In [6]:
#Adding number to student in index to keep track of individual student responses
df.reset_index(inplace=True)

for i in range(len(df)):
    df.iloc[i,0] = ('Student ' + str(i))

df.set_index('index', inplace=True)
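
The same labeling could also be done without a loop. This is a minimal sketch on a hypothetical toy frame (`toy` and `Q1` are stand-ins, not the survey data), assuming the index starts out as the repeated label 'Student'.

```python
import pandas as pd

# Hypothetical toy frame whose index repeats the label 'Student'
toy = pd.DataFrame({'Q1': ['yes', 'no', 'yes']},
                   index=['Student', 'Student', 'Student'])

# Build unique labels 'Student 0', 'Student 1', ... in one assignment
toy.index = [f'Student {i}' for i in range(len(toy))]
toy.index.name = 'index'  # same index name the reset_index/set_index round trip leaves behind

print(toy)
```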

In [7]:
#Function to report percentages of most common response for each statement.
#Optional arguments to include questions with a specific word and drop the column from the DataFrame
def results_search(df, word='', drop_col=False):
    #Initialize empty list 
    li=[]

    data=df.loc[:, df.columns.str.contains(word)]

    #Calculating the percentage
    for statement in data.columns:
        count = data[statement].value_counts()
        results = round(count.max()/count.sum(), 3)

        #Appending to empty list to create list containing results for all questions/statements from survey
        li.append([statement, results, count])

    #Ordering list based on the 2nd element (results) of each question/statement
    ordered_li = sorted(li, key=itemgetter(1), reverse=True)

    #Attempting to print in an easier to read format.
    for entry in ordered_li:
        print('-----------------------------','\n')
        for element in entry:
            print(element)

    #Remove the matched columns from the original DataFrame only when requested (defaults to False)
    if drop_col:
        df.drop(data.columns, axis=1, inplace=True)

At this point, I started searching with a variety of keywords.  I didn't drop the questions at first, but spent some time searching for common terms you would expect in a social-emotional well-being survey.  The keywords you see below are the ones that I included in my report.
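
If a table is easier to scan than printed blocks, the keyword search could return a DataFrame instead. The sketch below is a hypothetical variant, not the function used for the report: `summarize`, `toy` and the toy statements are made up for illustration, and it matches the keyword as a plain substring (`regex=False`).

```python
import pandas as pd

def summarize(df, word=''):
    """Return one row per matching statement: top response and its share."""
    # Select columns whose statement text contains the keyword
    cols = df.columns[df.columns.str.contains(word, regex=False)]
    rows = []
    for statement in cols:
        # Share of each response among the non-null answers
        share = df[statement].value_counts(normalize=True)
        rows.append({'statement': statement,
                     'top_response': share.idxmax(),
                     'share': round(share.max(), 3)})
    # Highest agreement first, mirroring the ordering of the printed report
    return (pd.DataFrame(rows, columns=['statement', 'top_response', 'share'])
              .sort_values('share', ascending=False)
              .reset_index(drop=True))

# Hypothetical toy survey
toy = pd.DataFrame({
    'My family supports me.': ['yes', 'yes', 'yes', 'no'],
    'My friends support me.': ['yes', 'no', 'no', 'no'],
    'I feel understood by my family.': ['yes', 'yes', 'no', 'no'],
})
result = summarize(toy, 'family')
print(result)
```

Unlike the printing version, this one never modifies the DataFrame passed in, so the same keyword can be searched repeatedly.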

In [8]:
results_search(df, 'family', drop_col=True)

----------------------------- 

My family is supportive of me to follow my dreams and achieve a college education.
1.0
yes    49
Name: My family is supportive of me to follow my dreams and achieve a college education., dtype: int64
----------------------------- 

I receive a lot of love and support from my family members.
0.898
yes       44
no         3
unsure     2
Name: I receive a lot of love and support from my family members., dtype: int64
----------------------------- 

My family models appropriate and responsible behavior.
0.878
yes       43
unsure     4
no         2
Name: My family models appropriate and responsible behavior., dtype: int64
----------------------------- 

I can go to my family for support and advice.
0.776
yes       38
no         8
unsure     3
Name: I can go to my family for support and advice., dtype: int64
----------------------------- 

I feel understood by my family.
0.673
yes       33
no        14
unsure     2
Name: I feel understood by my family., dtype: 

In [9]:
results_search(df,'friends', drop_col=True)

----------------------------- 

My friends want me to succeed.
0.939
yes       46
unsure     3
Name: My friends want me to succeed., dtype: int64
----------------------------- 

My friends treat me with respect and kindness
0.939
yes       46
no         2
unsure     1
Name: My friends treat me with respect and kindness, dtype: int64
----------------------------- 

I am good at making and keeping friends.
0.878
yes       43
no         5
unsure     1
Name: I am good at making and keeping friends., dtype: int64
----------------------------- 

My best friends model responsible behavior.
0.796
yes       39
no         6
unsure     4
Name: My best friends model responsible behavior., dtype: int64


In [10]:
results_search(df, 'stress', drop_col=True)

----------------------------- 

Sometimes I use use alcohol or drugs to manage my stress.
0.959
no     47
yes     2
Name: Sometimes I use use alcohol or drugs to manage my stress., dtype: int64
----------------------------- 

I manage my stress easily.
0.633
yes       31
no        15
unsure     3
Name: I manage my stress easily., dtype: int64
----------------------------- 

Sometimes I feel so stressed out that I don't know what to do.
0.592
yes    29
no     20
Name: Sometimes I feel so stressed out that I don't know what to do., dtype: int64


In [11]:
results_search(df, 'emotion', drop_col=True)

----------------------------- 

When I’m upset, I acknowledge my emotions.
0.265
most of the time       13
sometimes              12
about half the time    11
almost never            8
almost always           5
Name: When I’m upset, I acknowledge my emotions., dtype: int64


In [12]:
results_search(df, 'upset', drop_col=True)

----------------------------- 

When I’m upset, I become out of control.
0.816
almost never           40
sometimes               5
about half the time     2
most of the time        2
Name: When I’m upset, I become out of control., dtype: int64
----------------------------- 

When I’m upset, I lose control over my behavior.
0.796
almost never           39
sometimes               8
about half the time     1
most of the time        1
Name: When I’m upset, I lose control over my behavior., dtype: int64
----------------------------- 

I am helpful if someone is hurt, upset or feeling ill.
0.776
Certainly true    38
Somewhat true     11
Name: I am helpful if someone is hurt, upset or feeling ill., dtype: int64
----------------------------- 

When I’m upset, I have difficulty controlling my behavior.
0.735
almost never           36
sometimes              10
about half the time     2
almost always           1
Name: When I’m upset, I have difficulty controlling my behavior., dtype: int64
------

At this point, I went back to looking at the remaining questions all at once, as there no longer seemed to be any useful keywords to search for.

In [13]:
results_search(df)

----------------------------- 

People generally like me.
0.959
yes       47
unsure     1
no         1
Name: People generally like me., dtype: int64
----------------------------- 

I am optimistic about my future.
0.959
yes    47
no      2
Name: I am optimistic about my future., dtype: int64
----------------------------- 

Sometimes I hurt or cut on myself.
0.959
no     47
yes     2
Name: Sometimes I hurt or cut on myself., dtype: int64
----------------------------- 

I feel safe at school.
0.939
yes       46
unsure     2
no         1
Name: I feel safe at school., dtype: int64
----------------------------- 

I am a positive person.
0.939
yes       46
unsure     3
Name: I am a positive person., dtype: int64
----------------------------- 

Sometimes I pick on other people or bully them.
0.939
no        46
unsure     2
yes        1
Name: Sometimes I pick on other people or bully them., dtype: int64
----------------------------- 

Sometimes I restrict my food intake or make myself throw up

In [14]:
#Searching for students in potential crisis
df.loc[(df['Sometimes I hurt or cut on myself.']=='yes') |
      (df['Sometimes I restrict my food intake or make myself throw up.']=='yes') |
      (df['Sometimes I feel like the world would be better off without me in it.']=='yes'),
      ['Sometimes I hurt or cut on myself.', 
       'Sometimes I restrict my food intake or make myself throw up.', 
       'Sometimes I feel like the world would be better off without me in it.']
      ].T

index,Student 16,Student 20,Student 31,Student 32,Student 36,Student 37,Student 41,Student 45
Sometimes I hurt or cut on myself.,yes,no,no,no,no,yes,no,no
Sometimes I restrict my food intake or make myself throw up.,yes,no,no,yes,no,no,yes,yes
Sometimes I feel like the world would be better off without me in it.,no,yes,yes,yes,yes,yes,no,yes
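
The same filter can be written more compactly, and extended to count how many of the crisis statements each student answered 'yes' to. This is a minimal sketch on made-up data: the column names in `crisis_cols` and the frame `toy` are hypothetical stand-ins for the three survey statements.

```python
import pandas as pd

# Hypothetical stand-ins for the three crisis-related statements
crisis_cols = ['self_harm', 'food_restriction', 'better_off_without_me']

toy = pd.DataFrame(
    {'self_harm': ['no', 'yes', 'no'],
     'food_restriction': ['no', 'yes', 'no'],
     'better_off_without_me': ['no', 'no', 'yes']},
    index=['Student 0', 'Student 1', 'Student 2'])

# True wherever a student answered 'yes' to a crisis statement
flags = toy[crisis_cols].eq('yes')

# Keep students with at least one 'yes', and count how many each gave
at_risk = toy.loc[flags.any(axis=1), crisis_cols]
yes_counts = flags.sum(axis=1)

print(at_risk.T)
print(yes_counts[yes_counts > 0])
```

Sorting `yes_counts` in descending order would put students who flagged several statements at the top of the list.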
