## Analyzing Students' Mental Health

### Overview

A Japanese international university surveyed its students in 2018 and published a study the following
year. The study was approved by several ethical and regulatory boards.

The study found that international students have a higher risk of mental health difficulties than the   
general population, and that social connectedness (belonging to a social group) and acculturative stress   
(stress associated with joining a new culture) are predictive of depression.

Explore the `students` data using PostgreSQL to find out whether you would reach a similar conclusion for
international students, and whether length of stay is a contributing factor.

#### Students Data: Description of relevant columns

| **Field Name**  | **Description**                                    |
|-----------------|----------------------------------------------------|
| `inter_dom`     | Type of student (international or domestic)        |
| `japanese_cate` | Japanese language proficiency                      |
| `english_cate`  | English language proficiency                       |
| `academic`      | Current academic level (undergraduate or graduate) |
| `age`           | Current age of student                             |
| `stay`          | Current length of stay in years                    |
| `todep`         | Total score of depression (PHQ-9 test)             |
| `tosc`          | Total score of social connectedness (SCS test)     |
| `toas`          | Total score of acculturative stress (ASISS test)   |

#### Libraries to Install

* `ipython-sql`, which enables SQL magic functions for writing SQL queries in a Jupyter notebook

* `psycopg2`, a PostgreSQL adapter that lets Python connect to a PostgreSQL database

#### Import `students.csv` file to PostgreSQL Using pgAdmin

1. Create a table in pgAdmin
    * Make sure the column names match those in the header row of the CSV file.
    * Specify the data type that each column will contain.

2. Import the CSV file

Reference: https://learnsql.com/blog/how-to-import-csv-to-postgresql/
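
The table-creation step can be sketched in Python as well. The helper below, `build_create_table`, is a hypothetical name introduced for illustration: it reads only the header row of some CSV text and produces a `CREATE TABLE` statement, defaulting every column to `TEXT` unless a type is supplied. The column types shown are assumptions and should be checked against the actual data before use.

```python
import csv
import io


def build_create_table(csv_text, table_name="students", column_types=None):
    """Build a CREATE TABLE statement from the header row of CSV text."""
    column_types = column_types or {}
    # Only the first row (the header) is needed to get the column names.
    header = next(csv.reader(io.StringIO(csv_text)))
    columns = [
        '    "{}" {}'.format(name, column_types.get(name, "TEXT"))
        for name in header
    ]
    return "CREATE TABLE {} (\n{}\n);".format(table_name, ",\n".join(columns))


# A four-column excerpt of the dataset, used only to keep the example short.
sample_csv = "inter_dom,age,stay,todep\nInter,24,5,0\n"
ddl = build_create_table(
    sample_csv,
    column_types={"age": "INTEGER", "stay": "INTEGER", "todep": "INTEGER"},
)
print(ddl)
```

The generated statement can be pasted into the pgAdmin query tool before importing the CSV file.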


In [29]:
# Import python os module which has methods to interact with the operating system.
import os

In [30]:
# Get the values of the specified environment variables using the os.getenv() method.
user = os.getenv('DB_USER')
password = os.getenv('DB_PASS')
host = os.getenv('DB_HOST')
port = os.getenv('DB_PORT')

In [31]:
# Load ipython-sql
%load_ext sql 

# Set configuration to not display login credentials
%config SqlMagic.displaycon=False

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


#### Connect to PostgreSQL database

In [32]:
# Connect ipython-sql to postgresql database
db_connection = 'postgresql://{}:{}@{}:{}/project'.format(user, password, host, port)
%sql {db_connection}
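
Two things can go wrong with the formatted URL above: `os.getenv()` returns `None` for any variable that is not set, and a password containing characters such as `@` or `:` can make the URL ambiguous. A minimal sketch of a more defensive builder follows. `build_db_url` is a hypothetical helper name, the credentials are made up, and the default database name `project` is taken from the cell above.

```python
from urllib.parse import quote


def build_db_url(user, password, host, port, database="project"):
    """Return a PostgreSQL URL with the user and password percent-encoded."""
    settings = {"user": user, "password": password, "host": host, "port": port}
    missing = [name for name, value in settings.items() if not value]
    if missing:
        raise ValueError("missing connection settings: " + ", ".join(missing))
    # safe="" makes quote() encode every reserved character, including "/".
    return "postgresql://{}:{}@{}:{}/{}".format(
        quote(user, safe=""), quote(password, safe=""), host, port, database
    )


# Made-up credentials, used only to show the encoding.
example_url = build_db_url("analyst", "p@ss:word", "localhost", "5432")
print(example_url)
```

In the notebook, the values read with `os.getenv()` would be passed in place of the made-up ones.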

#### 1. Explore and understand the data

In [33]:
%%sql
--Display first five records in the students dataset
SELECT *
FROM students
LIMIT 5;

5 rows affected.


inter_dom,region,gender,academic,age,age_cate,stay,stay_cate,japanese,japanese_cate,english,english_cate,intimate,religion,suicide,dep,deptype,todep,depsev,tosc,apd,ahome,aph,afear,acs,aguilt,amiscell,toas,partner,friends,parents,relative,profess,phone,doctor,reli,alone,others,internet,partner_bi,friends_bi,parents_bi,relative_bi,professional_bi,phone_bi,doctor_bi,religion_bi,alone_bi,others_bi,internet_bi
Inter,SEA,Male,Grad,24,4,5,Long,3,Average,5,High,,Yes,No,No,No,0,Min,34,23,9,11,8,11,2,27,91,5,5,6,3,2,1,4,1,3,4,,Yes,Yes,Yes,No,No,No,No,No,No,No,No
Inter,SEA,Male,Grad,28,5,1,Short,4,High,4,High,,No,No,No,No,2,Min,48,8,7,5,4,3,2,10,39,7,7,7,4,4,4,4,1,1,1,,Yes,Yes,Yes,No,No,No,No,No,No,No,No
Inter,SEA,Male,Grad,25,4,6,Long,4,High,4,High,Yes,Yes,No,No,No,2,Min,41,13,4,7,6,4,3,14,51,3,3,3,1,1,2,1,1,1,1,,No,No,No,No,No,No,No,No,No,No,No
Inter,EA,Female,Grad,29,5,1,Short,2,Low,3,Average,No,No,No,No,No,3,Min,37,16,10,10,8,6,4,21,75,5,5,5,5,5,2,2,2,4,4,,Yes,Yes,Yes,Yes,Yes,No,No,No,No,No,No
Inter,EA,Female,Grad,28,5,1,Short,1,Low,3,Average,Yes,No,No,No,No,3,Min,37,15,12,5,8,7,4,31,82,5,5,5,2,5,2,5,5,4,4,,Yes,Yes,Yes,No,Yes,No,Yes,Yes,No,No,No


In [34]:
%%sql
--Count the total number of records in the students dataset
SELECT COUNT(*) AS total_records
FROM students;

1 rows affected.


total_records
286


In [35]:
%%sql
--Count the number of records for domestic or international students
SELECT inter_dom,
    COUNT(*) AS count_inter_dom
FROM students
GROUP BY inter_dom;

3 rows affected.


inter_dom,count_inter_dom
,18
Dom,67
Inter,201


#### 2. Filter to understand the data for each student type

In [36]:
%%sql
--Filter the dataset for international students and display the first five records
SELECT *
FROM students
WHERE inter_dom = 'Inter'
LIMIT 5;

5 rows affected.


inter_dom,region,gender,academic,age,age_cate,stay,stay_cate,japanese,japanese_cate,english,english_cate,intimate,religion,suicide,dep,deptype,todep,depsev,tosc,apd,ahome,aph,afear,acs,aguilt,amiscell,toas,partner,friends,parents,relative,profess,phone,doctor,reli,alone,others,internet,partner_bi,friends_bi,parents_bi,relative_bi,professional_bi,phone_bi,doctor_bi,religion_bi,alone_bi,others_bi,internet_bi
Inter,SEA,Male,Grad,24,4,5,Long,3,Average,5,High,,Yes,No,No,No,0,Min,34,23,9,11,8,11,2,27,91,5,5,6,3,2,1,4,1,3,4,,Yes,Yes,Yes,No,No,No,No,No,No,No,No
Inter,SEA,Male,Grad,28,5,1,Short,4,High,4,High,,No,No,No,No,2,Min,48,8,7,5,4,3,2,10,39,7,7,7,4,4,4,4,1,1,1,,Yes,Yes,Yes,No,No,No,No,No,No,No,No
Inter,SEA,Male,Grad,25,4,6,Long,4,High,4,High,Yes,Yes,No,No,No,2,Min,41,13,4,7,6,4,3,14,51,3,3,3,1,1,2,1,1,1,1,,No,No,No,No,No,No,No,No,No,No,No
Inter,EA,Female,Grad,29,5,1,Short,2,Low,3,Average,No,No,No,No,No,3,Min,37,16,10,10,8,6,4,21,75,5,5,5,5,5,2,2,2,4,4,,Yes,Yes,Yes,Yes,Yes,No,No,No,No,No,No
Inter,EA,Female,Grad,28,5,1,Short,1,Low,3,Average,Yes,No,No,No,No,3,Min,37,15,12,5,8,7,4,31,82,5,5,5,2,5,2,5,5,4,4,,Yes,Yes,Yes,No,Yes,No,Yes,Yes,No,No,No


In [37]:
%%sql
--Filter the dataset for domestic students and display the first five records
SELECT *
FROM students
WHERE inter_dom = 'Dom'
LIMIT 5;


5 rows affected.


inter_dom,region,gender,academic,age,age_cate,stay,stay_cate,japanese,japanese_cate,english,english_cate,intimate,religion,suicide,dep,deptype,todep,depsev,tosc,apd,ahome,aph,afear,acs,aguilt,amiscell,toas,partner,friends,parents,relative,profess,phone,doctor,reli,alone,others,internet,partner_bi,friends_bi,parents_bi,relative_bi,professional_bi,phone_bi,doctor_bi,religion_bi,alone_bi,others_bi,internet_bi
Dom,JAP,Female,Grad,27,5,2,Medium,3,Average,3,Average,Yes,Yes,No,Yes,Major,12,Mod,47,16,11,5,8,7,3,31,81,7,3,7,1,6,6,1,5,4,1,,Yes,No,Yes,No,Yes,Yes,No,Yes,No,No,No
Dom,JAP,Female,Under,18,1,1,Short,5,High,3,Average,No,No,No,No,No,9,Mild,48,9,4,5,4,3,2,10,37,4,4,4,4,1,1,1,1,1,1,4.0,No,No,No,No,No,No,No,No,No,No,No
Dom,JAP,Female,Under,21,3,3,Medium,5,High,3,Average,Yes,No,No,No,No,7,Mild,40,16,8,10,8,6,4,20,72,6,6,7,1,1,1,5,1,1,1,4.0,Yes,Yes,Yes,No,No,No,Yes,No,No,No,No
Dom,JAP,Male,Under,20,2,3,Medium,5,High,1,Low,No,No,No,No,No,3,Min,47,11,4,5,4,5,2,12,43,1,5,5,3,1,1,3,1,1,1,3.0,No,Yes,Yes,No,No,No,No,No,No,No,No
Dom,JAP,Female,Under,21,3,3,Medium,5,High,1,Low,No,No,Yes,Yes,Other,10,Mod,48,8,4,5,4,3,2,10,36,7,5,7,1,1,1,1,1,1,1,1.0,Yes,Yes,Yes,No,No,No,No,No,No,No,No


In [38]:
%%sql
--Filter the dataset for students not in the domestic or international categories
--Note: rows where inter_dom is NULL are not matched, because NULL NOT IN (...) is never true
SELECT *
FROM students
WHERE inter_dom NOT IN ('Inter', 'Dom');

0 rows affected.


inter_dom,region,gender,academic,age,age_cate,stay,stay_cate,japanese,japanese_cate,english,english_cate,intimate,religion,suicide,dep,deptype,todep,depsev,tosc,apd,ahome,aph,afear,acs,aguilt,amiscell,toas,partner,friends,parents,relative,profess,phone,doctor,reli,alone,others,internet,partner_bi,friends_bi,parents_bi,relative_bi,professional_bi,phone_bi,doctor_bi,religion_bi,alone_bi,others_bi,internet_bi
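
This query returns no rows even though the earlier count shows 18 records with an empty `inter_dom`. In SQL, `NULL NOT IN (...)` evaluates to `NULL` rather than true, so those rows are filtered out, and `IS NULL` is needed to select them. The sketch below reproduces the behavior with Python's built-in `sqlite3` module as a stand-in for PostgreSQL (an assumption, made so the example runs without a database server). The table `demo_students` and its three rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_students (inter_dom TEXT)")
conn.executemany(
    "INSERT INTO demo_students (inter_dom) VALUES (?)",
    [("Inter",), ("Dom",), (None,)],  # None is stored as NULL
)

# NOT IN never matches a NULL value.
not_in_count = conn.execute(
    "SELECT COUNT(*) FROM demo_students WHERE inter_dom NOT IN ('Inter', 'Dom')"
).fetchone()[0]

# IS NULL selects the uncategorized row explicitly.
is_null_count = conn.execute(
    "SELECT COUNT(*) FROM demo_students WHERE inter_dom IS NULL"
).fetchone()[0]

print(not_in_count, is_null_count)  # prints: 0 1
conn.close()
```

Applied to the `students` table, the equivalent filter would be `WHERE inter_dom IS NULL`.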


#### 3. Query the summary statistics of the diagnostics scores for `all` students

In [39]:
%%sql
--Summary statistics of total scores of depression (todep), social connectedness (tosc), 
--and acculturative stress (toas)
SELECT
    MIN(todep) AS min_phq,
    MAX(todep) AS max_phq,
    ROUND(AVG(todep), 2) AS avg_phq,
    MIN(tosc) AS min_scs,
    MAX(tosc) AS max_scs,
    ROUND(AVG(tosc), 2) AS avg_scs,
    MIN(toas) AS min_as,
    MAX(toas) AS max_as,
    ROUND(AVG(toas), 2) AS avg_as
FROM students;


1 rows affected.


min_phq,max_phq,avg_phq,min_scs,max_scs,avg_scs,min_as,max_as,avg_as
0,25,8.19,8,48,37.47,36,145,72.38


#### 4. Summarize the diagnostics scores for `international` students only

In [40]:
%%sql
--Summary statistics of total scores of depression (todep), social connectedness (tosc), 
--and acculturative stress (toas)
SELECT
    MIN(todep) AS min_phq,
    MAX(todep) AS max_phq,
    ROUND(AVG(todep), 2) AS avg_phq,
    MIN(tosc) AS min_scs,
    MAX(tosc) AS max_scs,
    ROUND(AVG(tosc), 2) AS avg_scs,
    MIN(toas) AS min_as,
    MAX(toas) AS max_as,
    ROUND(AVG(toas), 2) AS avg_as
FROM students
WHERE inter_dom = 'Inter';

1 rows affected.


min_phq,max_phq,avg_phq,min_scs,max_scs,avg_scs,min_as,max_as,avg_as
0,25,8.04,11,48,37.42,36,145,75.56


#### 5. See the impact of length of stay of an international student on average diagnostic scores

In [41]:
%%sql
--Average scores of depression (todep), social connectedness (tosc), and acculturative stress (toas)
--based on length of stay
SELECT
    stay AS length_of_stay,
    ROUND(AVG(todep), 2) AS average_phq,
    ROUND(AVG(tosc), 2) AS average_scs,
    ROUND(AVG(toas), 2) AS average_as
FROM students
WHERE inter_dom = 'Inter'
GROUP BY stay 
ORDER BY stay DESC;

9 rows affected.


length_of_stay,average_phq,average_scs,average_as
10,13.0,32.0,50.0
8,10.0,44.0,65.0
7,4.0,48.0,45.0
6,6.0,38.0,58.67
5,0.0,34.0,91.0
4,8.57,33.93,87.71
3,9.09,37.13,78.0
2,8.28,37.08,77.67
1,7.48,38.11,72.8
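
Averages for groups that contain very few students can be misleading, so it may help to report the group size next to each average. The sketch below shows one way to do that by adding `COUNT(*)` to the grouped query. It uses Python's built-in `sqlite3` module and a made-up table, `demo_students`, with five invented rows, so the numbers are illustrative only and are not taken from the dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE demo_students (inter_dom TEXT, stay INTEGER, todep INTEGER)"
)
conn.executemany(
    "INSERT INTO demo_students VALUES (?, ?, ?)",
    [
        ("Inter", 1, 6),
        ("Inter", 1, 9),
        ("Inter", 1, 8),
        ("Inter", 5, 0),
        ("Dom", 3, 7),
    ],
)

# Same shape as the query above, with the number of students per group added.
group_rows = conn.execute(
    """
    SELECT stay AS length_of_stay,
           COUNT(*) AS n_students,
           ROUND(AVG(todep), 2) AS average_phq
    FROM demo_students
    WHERE inter_dom = 'Inter'
    GROUP BY stay
    ORDER BY stay DESC
    """
).fetchall()
conn.close()

for row in group_rows:
    print(row)
```

A group with `n_students` equal to 1 reports that single student's score as its average.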


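Based on the average depression scores (`average_phq`), the above analysis gives some support to the finding that
international students have a higher risk of mental health difficulties, especially those who stay for longer
periods: the highest averages appear at stays of 8 and 10 years (10.0 and 13.0). The trend is not uniform, however.
Stays of 5 to 7 years show lower averages, and some groups appear to be very small (the 5-year row matches a single
record in the first query output), so these figures should be read with caution.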

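However, stays of six years or more are associated with lower average scores of acculturative stress (`average_as`
between 45.0 and 65.0, compared with 72.8 to 91.0 for stays of one to five years), as
students likely adapt by gradually getting used to the culture of a foreign country.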
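
To make the comparison of acculturative stress concrete, the grouped averages from the query output can be split at a cutoff. The sketch below copies the `average_as` values from that output and uses a cutoff of five years, which is an arbitrary choice for illustration and not part of the study. It takes an unweighted mean of the group averages, so it ignores how many students fall in each group.

```python
from statistics import mean

# length_of_stay -> average_as, copied from the query output above
average_as_by_stay = {
    10: 50.0,
    8: 65.0,
    7: 45.0,
    6: 58.67,
    5: 91.0,
    4: 87.71,
    3: 78.0,
    2: 77.67,
    1: 72.8,
}

CUTOFF_YEARS = 5  # arbitrary split chosen for illustration

shorter_stays = [v for stay, v in average_as_by_stay.items() if stay <= CUTOFF_YEARS]
longer_stays = [v for stay, v in average_as_by_stay.items() if stay > CUTOFF_YEARS]

shorter_mean = round(mean(shorter_stays), 2)
longer_mean = round(mean(longer_stays), 2)

print("stays of up to 5 years:", shorter_mean)
print("stays of more than 5 years:", longer_mean)
```

A weighted version would need the number of students in each group, which the query output does not report.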

References:

PostgreSQL Integration with Jupyter Notebook   
https://medium.com/analytics-vidhya/postgresql-integration-with-jupyter-notebook-deb97579a38d

How to use Psycopg2: The PostgreSQL Adapter for Python   
https://www.timescale.com/blog/how-to-use-psycopg2-the-postgresql-adapter-for-python/

Accessing Databases with SQL Magic   
https://gist.github.com/ttadesusi/69224203c01ff107f735d66496bf26a2