In [1]:
# Set up SQL magic with DuckDB so that the notebook can be downloaded and run anywhere
%load_ext sql
%config SqlMagic.autopandas = True
%config SqlMagic.displaycon = False
%config SqlMagic.feedback = False
%sql duckdb:///:memory:

In [4]:
%%sql
-- Create the students table
DROP TABLE IF EXISTS students;
CREATE TABLE students AS 
SELECT * FROM read_csv_auto('students.csv');

Unnamed: 0,Success


![Illustration of silhouetted heads](mentalhealth.jpg)

**_Does going to university in a different country affect your mental health?_** 

A Japanese international university surveyed its students in 2018 and published a study the following year that was approved by several ethical and regulatory boards.

The study found that **international students** have a **higher risk of mental health difficulties** than the general population, and that **social connectedness** (belonging to a social group) and **acculturative stress** (stress associated with joining a new culture) are **predictive of depression**.


This project explores the `students` data using **SQL** (executed with DuckDB in this notebook) to find out whether a similar conclusion can be drawn for international students and to **see if the length of stay is a contributing factor**.

Here is a data description of the columns in the `students` data:

| Field Name    | Description                                      |
| ------------- | ------------------------------------------------ |
| `inter_dom`     | Type of student (international or domestic)      |
| `japanese_cate` | Japanese language proficiency                    |
| `english_cate`  | English language proficiency                     |
| `academic`      | Current academic level (undergraduate or graduate) |
| `age`           | Current age of student                           |
| `stay`          | Current length of stay in years                  |
| `todep`         | Total score of depression (PHQ-9 test)           |
| `tosc`          | Total score of social connectedness (SCS test)   |
| `toas`          | Total score of acculturative stress (ASISS test) |

Here is an overview of the `students` data:

In [5]:
%%sql
SELECT * 
FROM students;

Unnamed: 0,inter_dom,region,gender,academic,age,age_cate,stay,stay_cate,japanese,japanese_cate,...,friends_bi,parents_bi,relative_bi,professional_bi,phone_bi,doctor_bi,religion_bi,alone_bi,others_bi,internet_bi
0,Inter,SEA,Male,Grad,24,4,5,Long,3,Average,...,Yes,Yes,No,No,No,No,No,No,No,No
1,Inter,SEA,Male,Grad,28,5,1,Short,4,High,...,Yes,Yes,No,No,No,No,No,No,No,No
2,Inter,SEA,Male,Grad,25,4,6,Long,4,High,...,No,No,No,No,No,No,No,No,No,No
3,Inter,EA,Female,Grad,29,5,1,Short,2,Low,...,Yes,Yes,Yes,Yes,No,No,No,No,No,No
4,Inter,EA,Female,Grad,28,5,1,Short,1,Low,...,Yes,Yes,No,Yes,No,Yes,Yes,No,No,No
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
281,,,,,,,,,,,...,222,,,,,,,,,
282,,,,,,,,,,,...,249,,,,,,,,,
283,,,,,,,,,,,...,203,,,,,,,,,
284,,,,,,,,,,,...,247,,,,,,,,,


To analyze how the **length of stay impacts the average mental health diagnostic scores** of the **international students** present in the study, we select the following columns:
* Length of **stay**
* **Count** of **international students**
* **Average** score of **depression**
* **Average** score of **social connectedness**
* **Average** score of **acculturative stress**

We group by **stay** and sort by it in descending order, so we can see how each score varies with the length of stay.

In [6]:
%%sql
SELECT 
	stay, -- length of stay
	COUNT(inter_dom) AS count_int, -- count of students for each length of the stay
	ROUND(AVG(todep),2) AS average_phq, -- average score of depression rounded to 2 decimal places
	ROUND(AVG(tosc),2) AS average_scs, -- average score of social connectedness rounded to 2 decimal places
	ROUND(AVG(toas),2) AS average_as -- average score of acculturative stress rounded to 2 decimal places
FROM students
WHERE inter_dom = 'Inter' -- filter on international students
GROUP BY stay
ORDER BY stay DESC;

Unnamed: 0,stay,count_int,average_phq,average_scs,average_as
0,10,1,13.0,32.0,50.0
1,8,1,10.0,44.0,65.0
2,7,1,4.0,48.0,45.0
3,6,3,6.0,38.0,58.67
4,5,1,0.0,34.0,91.0
5,4,14,8.57,33.93,87.71
6,3,46,9.09,37.13,78.0
7,2,39,8.28,37.08,77.67
8,1,95,7.48,38.11,72.8


The **impact of the length of stay on depression** is shown in the following graph of the **average depression scores by length of stay** for international students:

<img src="Average%20depression%20scores%20by%20length%20of%20stay%20for%20international%20students.png" alt="Average depression scores by length of stay for international students" width="75%">


For students staying 1 to 3 years, **depression scores increase as the stay gets longer** (the average PHQ-9 score rises from 7.48 to 9.09). The scores then decrease slightly between the 3rd and the 4th year (to 8.57).

The results after 4 years need to be interpreted with caution, since each is based on a very small sample (1 to 3 students), which may not be representative of the international student population as a whole.

This increase in depression scores with the length of stay is **initial evidence that the length of stay is a contributing factor to mental health difficulties** for international students.
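The small-sample caveat above can be made explicit with a short check. The following is a minimal sketch in plain Python, using the `stay`, `count_int` and `average_phq` values copied from the query output. The cutoff `MIN_GROUP_SIZE = 10` and the helper `split_by_sample_size` are assumptions chosen for illustration, not part of the original study or notebook.

```python
# Rows copied from the query output above: (stay, count_int, average_phq)
rows = [
    (10, 1, 13.0),
    (8, 1, 10.0),
    (7, 1, 4.0),
    (6, 3, 6.0),
    (5, 1, 0.0),
    (4, 14, 8.57),
    (3, 46, 9.09),
    (2, 39, 8.28),
    (1, 95, 7.48),
]

# Hypothetical cutoff: groups with fewer students are treated as too small
# to interpret on their own. The value 10 is an assumption for this sketch.
MIN_GROUP_SIZE = 10


def split_by_sample_size(rows, min_size):
    """Split rows into (large enough, too small) groups, each sorted by stay."""
    reliable = sorted(r for r in rows if r[1] >= min_size)
    small = sorted(r for r in rows if r[1] < min_size)
    return reliable, small


reliable, small = split_by_sample_size(rows, MIN_GROUP_SIZE)

print("Groups with enough students:")
for stay, count, phq in reliable:
    print(f"  stay={stay} years, n={count}, average PHQ-9={phq}")

print("Groups to interpret with caution:")
for stay, count, phq in small:
    print(f"  stay={stay} years, n={count}, average PHQ-9={phq}")
```

With this cutoff, only stays of 1 to 4 years are kept as interpretable groups, which matches the range discussed in the text. A different cutoff would be equally defensible; the point is only to separate well-sampled groups from the rest before reading a trend into them.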

The **impact of length of stay** on **social connectedness** and **acculturative stress** is shown in the following graphs of the **average social connectedness scores by length of stay** and the **average acculturative stress scores by length of stay** for international students:

<img src="Average%20social%20connectedness%20scores%20by%20length%20of%20stay%20for%20international%20students.png" alt="Average social connectedness scores by length of stay for international students" width=75%>

<img src="Average%20acculturative%20stress%20scores%20by%20length%20of%20stay%20for%20international%20students.png" alt="Average acculturative stress scores by length of stay for international students" width=75%>

Between 1 and 4 years, **social connectedness decreases** overall (from 38.11 to 33.93) while **acculturative stress increases** (from 72.8 to 87.71) as the stay gets longer.

After the 4th year, the results are less reliable because the sample sizes are much smaller.

These factors, which the original study found to be predictive of depression, are **further evidence that stay length is a contributing factor to mental health difficulties** for international students.
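To check the direction of these trends year by year, the averages for stays of 1 to 4 years can be compared with a few lines of plain Python. This is only a sketch: the values are copied from the query output above, and the helper name `yearly_changes` is an assumption introduced for illustration.

```python
# Rows copied from the query output above, limited to stays of 1 to 4 years:
# (stay, average_scs, average_as)
rows = [
    (1, 38.11, 72.8),
    (2, 37.08, 77.67),
    (3, 37.13, 78.0),
    (4, 33.93, 87.71),
]


def yearly_changes(values):
    """Return the difference between each value and the one before it."""
    return [round(later - earlier, 2) for earlier, later in zip(values, values[1:])]


# Change in each average score from one length of stay to the next
scs_changes = yearly_changes([scs for _, scs, _ in rows])
as_changes = yearly_changes([stress for _, _, stress in rows])

print("Social connectedness changes:", scs_changes)
print("Acculturative stress changes:", as_changes)
```

Acculturative stress rises at every step, while social connectedness falls overall but is almost flat between the 2nd and 3rd year. These are differences between group averages, not a statistical test, so they describe the trend without establishing significance.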

In conclusion, the average mental health diagnostic scores by length of stay suggest that **length of stay is a contributing factor to mental health difficulties** for international students.