# Assignment 3: Labor Trafficking in America

### Substantive Objectives
The goal of this assignment is to place individual cases of forced labor in a broader systemic framework, synthesizing across multiple sources.

### Coding Objectives
Tabulation practice!
* `filter()`
* `group_by()`
* `summarise()`
* `count()`
* `n()`
* `arrange(desc(column_name))`
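
As a warm-up, here is a minimal sketch of how these functions fit together. The `toy` table and its columns (`sector`, `flagged`) are made up for illustration and are not part of the CTDC dataset.

```r
library(dplyr)

# Hypothetical toy data (NOT from the CTDC dataset)
toy <- tibble(
  sector  = c("farm", "farm", "home", "home", "home", "home", "shop"),
  flagged = c(1, 1, 1, 1, 0, 1, 1)
)

# filter() keeps matching rows, group_by() + summarise(n()) counts rows
# per group, and arrange(desc()) sorts the largest count first
by_sector <- toy %>%
  filter(flagged == 1) %>%
  group_by(sector) %>%
  summarise(count = n()) %>%
  arrange(desc(count))

# count() is shorthand for group_by() + summarise(n());
# note that it names the new column "n" rather than "count"
by_sector_short <- toy %>%
  filter(flagged == 1) %>%
  count(sector, sort = TRUE)

by_sector
by_sector_short
```

Both tables hold the same counts; the only difference is the name of the count column, which matters whenever a specific column name is required.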

## Setup
The code chunk below loads the packages that we need. 

In [None]:
# You *must* run this cell first. Do not change the contents of this cell.
library(testthat)
library(ottr)
suppressMessages(library(tidyverse))

The code chunk below loads the CTDC dataset.

In [None]:
# You *must* run this cell first. Do not change the contents of this cell.
ctdc <- read.csv("2024_CTDC_synthetic.csv")

---
## Question 1

<!-- BEGIN QUESTION -->

The following are potential cases of trafficking generated by ChatGPT. 

Write a response to each of the following scenarios as if you are a human trafficking consultant for a social worker assisting in the case. Each response should **use examples/evidence from readings and lecture** and **cite at least 3 sources (they can overlap).** 

*(Max: 500 words per response - flexible)* 

* **Paragraph 1: Argument** - Is this a case of human trafficking according to the TVPA, the Palermo Protocol, and the spirit of the trafficking definition? Why?
* **Paragraph 2: Job-specific vulnerabilities** - What made this individual vulnerable to being trafficked into this job? What features of the job make it a common occupation where trafficking is observed?
* **Paragraph 3: US legal system** - What about US laws around this sector and the broader system reinforces the patterns of trafficking?
* **Paragraph 4: Investigation Outcome** - Based on the readings, what do you suspect was the outcome of the investigation?
* **Paragraph 5: Proposal** - What is one promising program to address a case like this, and what are the strengths and limitations of your proposed program?


---

#### **Question 1a: Domestic Work in the US**

(10 points)

A 27-year-old woman named Maria is recruited from her home country in Central America to work as a live-in housekeeper and nanny for a wealthy family in California. The job offer promises good wages, a place to live, and an opportunity for Maria to send money home to support her family.

Upon arrival, however, Maria's situation turns out very differently. Her employers confiscate her passport and work visa, telling her it’s for “safe-keeping.” Maria is made to work 14-16 hours a day, seven days a week, cleaning the house, cooking, and caring for the children, with no days off. She sleeps on a mattress in a small utility room, is provided minimal food, and is prohibited from leaving the house without permission.

The family threatens Maria, telling her that if she tries to escape or report her situation, they will contact immigration authorities and have her deported. She is paid far less than was initially promised—sometimes nothing at all—leaving her trapped and unable to send money home as planned.

Over time, Maria’s physical and mental health deteriorates due to overwork, isolation, and abuse. She is kept from accessing outside help and doesn’t know the language or her rights in the U.S. Eventually, she manages to confide in a neighbor or a delivery worker, who notices the signs of exploitation and reports the situation to authorities. An investigation is launched, and the results of the investigation remain unknown.

_Type your answer here, replacing this text._

<!-- END QUESTION -->

<!-- BEGIN QUESTION -->

--- 

#### **Question 1b: Agricultural Work in the US**

(10 points)

Carlos, a 35-year-old man from Mexico, is recruited by a labor contractor to work on a farm in Florida. The recruiter promises him a well-paying job with housing included, and assures Carlos that he will receive legal work documents upon arrival. Desperate to provide for his family, Carlos agrees, even paying a high recruitment fee to secure the job.

When Carlos arrives, he quickly realizes that the situation is far from what was promised. His work visa is either nonexistent or forged, and his living conditions are deplorable—he shares a small, overcrowded shack with several other workers. There is little to no access to running water, sanitation, or proper nutrition.

Carlos and the other workers are forced to work long hours, sometimes 12-14 hours a day, seven days a week, picking crops under extreme heat. They are paid far less than the minimum wage, often receiving less than promised or not being paid at all. If the workers protest or try to leave, the contractor threatens them with deportation or physical violence, or claims that they still owe recruitment debts.

The labor contractor confiscates Carlos's identification documents, telling him that he will get them back once he has "worked off" his recruitment fees. Workers are constantly monitored, and they are forbidden from contacting anyone outside the farm. There is little to no medical assistance, and anyone who becomes sick or injured is either ignored or threatened.

After several months of exploitation, Carlos meets a local volunteer from a migrant worker advocacy group during a rare trip to town. The volunteer becomes suspicious of the signs of abuse and contacts law enforcement, leading to an investigation. The results of the investigation remain unknown. 


_Type your answer here, replacing this text._

<!-- END QUESTION -->

<!-- BEGIN QUESTION -->

--- 
## Question 2: Statistics and Interpretation

**a) (2 points) The following estimate is taken from the [Boittin and Mo et al. study](https://bcourses.berkeley.edu/courses/1536438/files/folder/Week%207?preview=89670807) (2024) on migrant domestic workers in Hong Kong.**


The government wants to know whether it is worth spending money on awareness-raising training for employers of migrant domestic workers. They hire researchers to run an experiment in which half of the employers are randomly assigned to receive training and the other half are randomly assigned not to receive it.

They find that employers who received the training scored 7.6 percentage points higher on a test (scored out of 100 percentage points) measuring their knowledge of the rights that migrant domestic workers have. If you were the researcher, how confident would you be in this relationship? Why?

 $$KnowledgeTestScore = \alpha + \beta_1 ReceivedTraining$$

where $\beta_1  = 0.076$ *(this should be converted to percentage points for interpretation)*
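
One way to read this equation, as a sketch: `ReceivedTraining` is a 0/1 indicator, so $\alpha$ is the average score for employers without training and $\alpha + \beta_1$ is the average score for employers with training. This assumes the score is recorded as a proportion between 0 and 1, which the conversion note implies but does not state outright.

$$KnowledgeTestScore_{trained} - KnowledgeTestScore_{untrained} = (\alpha + \beta_1) - \alpha = \beta_1$$

$$\beta_1 \times 100 = 0.076 \times 100 = 7.6 \text{ percentage points}$$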

*(Approx: 2-3 sentences).*

_Type your answer here, replacing this text._

<!-- END QUESTION -->

<!-- BEGIN QUESTION -->

**b) (2 points) The chart below is taken from this [factsheet](https://polarisproject.org/wp-content/uploads/2019/07/Domestic_Worker_Fact_Sheet.pdf) by Polaris, and indicates where potential trafficking cases have been reported. What does the chart say about the distribution of trafficking cases involving domestic workers in the US? Provide at least 3 potential reasons why some areas are more sparse than others, apart from population density.**

Provide at least one citation from lecture, reading, and/or outside source as evidence for your answer. 

*(Approx: 3-4 sentences)*

<img src="polaris_domestic_worker_cases.png" width="700">


_Type your answer here, replacing this text._

<!-- END QUESTION -->

<!-- BEGIN QUESTION -->

**c) (2 points) Run the code below. What are the three most represented countries of origin for forced domestic work cases in America? What can we learn from this analysis?**

*Think carefully about what we can and **cannot** infer from this statistic*

*(Approx: 3-4 sentences)*

The code does the following with the CTDC dataset:
1) Filters to confirmed cases of forced labor in the USA in the domestic work sector. 
2) Groups by the country of origin, `citizenship`. (If we had not provided this variable name, you could identify it via the [CTDC codebook](https://www.ctdatacollaborative.org/sites/g/files/tmzbdl2011/files/2024-02/Codebook_CTDC_global_synthetic_data_v2024.pdf).) 
3) Counts the number of cases in each group (i.e., the number of cases from each country of origin). 
4) Sorts in descending order by the count for each country of origin.

In [None]:
# RUN CODE, DO NOT CHANGE
ctdc %>% 
    # filter to relevant subset
    filter(CountryOfExploitation == "USA" & # filter to cases exploited in the USA
           isForcedLabour == 1 & # filter to confirmed forced labor
           typeOfLabourDomesticWork == 1) %>%  # filter to domestic work
    # group by the variable of interest: citizenship
    group_by(citizenship) %>%
    # find the number of observations within each group, naming this column "count"
    summarise(count = n()) %>% 
    # sort in descending order
    arrange(desc(count))

_Type your answer here, replacing this text._

<!-- END QUESTION -->

**d) (2 points)** Find the distribution across _age brackets_ for **confirmed** forced labor cases in the **USA** that report **Debt Bondage** in **agriculture**. Store the result in `agriculture_workers`. Sort in descending order by the count for each age bracket. 

Repeat for construction, storing this dataframe in `construction_workers`.

**Steps (very similar to example code in 2c):**
1. Filter the dataframe to the specified conditions *(Hint: you should include four conditions that are bolded)*
2. Group by your variable of interest
3. Find the count across each age group using `summarise()` and `n()`. Make sure to name this new column `count` for the tests to pass. 
4. Sort in descending order using `arrange()` and `desc()`

Variables that matter: `ageBroad`, `isForcedLabour`, `meansDebtBondageEarnings`, `typeOfLabourAgriculture`, `CountryOfExploitation`

In [None]:
# store your tabulation for agriculture workers 
agriculture_workers <- NULL # YOUR CODE HERE

# store your tabulation for construction workers 
construction_workers <- NULL # YOUR CODE HERE

# display both tables
paste0("Age Distribution for Agriculture Workers")
agriculture_workers

paste0("Age Distribution for Construction Workers")
construction_workers

*The cell below checks your answer for correctness, but we did not have time to build in detailed error messages. Ask a friend or post on Ed if you get stuck!*

In [None]:
. = ottr::check("tests/q2d.R")

<!-- BEGIN QUESTION -->

**e) (2 points)** From the results above, discuss...
> (i) How do the distributions compare? *(Approx 1-2 sentences)* \
(ii) How does the amount of missing data limit the ability to make inferences of demographics by sector? *(Approx 2-3 sentences)*

_Type your answer here, replacing this text._

<!-- END QUESTION -->

# Submitting Your Notebook (please read carefully!)

To submit your notebook...

### 1. Click `File` $\rightarrow$ `Save Notebook`.

### 2. Wait 5 seconds.

### 3. Select the cell below and hit run.

In [None]:
ottr::export("pset3.ipynb")

After you hit "Run" on the cell above, click the download link. A .zip file should download to your computer.

(If you make changes to your notebook, you'll need to hit save and then run the cell above again before you submit to get a new version of it.)

### 4. Submit the .zip file you just downloaded <a href="https://www.gradescope.com/" target="_blank">on Gradescope here</a>.

Notes:

- **This does not seem to work on Chrome for iPad or iPhone.** If you're using an iPad or iPhone, you need to download the file using **Safari**.
- If your web browser automatically unzips the .zip file (so you see a folder instead of a .zip file), you can just upload the .ipynb file that is inside the folder.
- If this method is not working for you, try this: hit `File`, then `Download as`, then `Notebook (.ipynb)` and submit that.