# Participant Management - Translation & Communication
## Large Scale Computing for the Social Sciences, Final Project
### Max Kramer
---
It is recommended to run this notebook on an EMR cluster, although it can also be run locally or on the Midway cluster. You need `awscli` installed and your AWS *credentials* file (by default `~/.aws/credentials`) updated with your AWS information. If you intend to process more than a few hundred participants, running on an EMR cluster is **strongly** recommended.

### Import libraries and initialize AWS clients

In [41]:
import csv
import boto3
import awscli
import pandas as pd

In [18]:
translate = boto3.client('translate') # initialize translation client

s3 = boto3.client('s3') # initialize s3 client for data storage

sns = boto3.client('sns') # initialize sns client for communication

In [30]:
response = translate.translate_text(
    Text='I am testing the translator',
    SourceLanguageCode='auto',  # let the service detect the source language
    TargetLanguageCode='es',    # translate into Spanish
)

In [31]:
print(response['TranslatedText'])

Estoy probando el traductor
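A message usually needs to be translated once per target language rather than once per participant. The following is a minimal sketch of that idea, not part of the original notebook: `translate_once_per_language` is a hypothetical helper, and it assumes a client that exposes `translate_text` with the same keyword arguments as the boto3 Translate client used above.

```python
def translate_once_per_language(client, text, language_codes, source="en"):
    """Return {language_code: translated_text}.

    `client` is assumed to behave like boto3.client('translate').
    Each distinct language triggers at most one API call.
    """
    translations = {}
    for code in language_codes:
        if code in translations:
            continue  # this language was already handled
        if code == source:
            translations[code] = text  # no API call for the source language
            continue
        response = client.translate_text(
            Text=text,
            SourceLanguageCode=source,
            TargetLanguageCode=code,
        )
        translations[code] = response["TranslatedText"]
    return translations
```

With the client from this notebook, a call could look like `translate_once_per_language(translate, 'I am testing the translator', ['en', 'es', 'es', 'zh'])`, which would call the service only for `es` and `zh`.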


### Read in data

In [36]:
response = s3.list_objects(Bucket='lcssfinal') # list the objects stored in the project bucket

In [38]:
share_url = s3.generate_presigned_url(
  ClientMethod='get_object',
  ExpiresIn=3600,  # the link stays valid for one hour
  Params={'Bucket': 'lcssfinal', 'Key': 'ingest_output.csv'}
)

In [39]:
print(share_url)

https://lcssfinal.s3.amazonaws.com/ingest_output.csv?AWSAccessKeyId=ASIARDW23IV4H3QVD5HV&Signature=f6XqK9%2F4Ozdl9ezDwi6f%2B3OL058%3D&x-amz-security-token=FwoGZXIvYXdzEGcaDOQML3GLyC9HOisnqyLFAafcBVCFPMs4F3z1PNsDzI47OotT6W5cJglvmmou%2FwI5yPhPcDPJMHO7aSaV7UPtqFVOsdSDbOVoMcarXSUSO6ofoeZUe0rS8PbBuWmYPrStV3w7f09xQf9tPR7uzQAYqCHNYRoCgZtSFdluS3NLpUl3AaobQfRuo7tXgAzE2Be3zZlBaZ2DSi%2BR1ywkQYIFtvj6yB1SEPG4jxQeX4GBfjs1y45dqM16P77XfxyiyVOWYzCocWQywoX98dipYHLBMRMx6%2B8zKPiA54UGMi3i2fTBTbE1PpROYspR8jpXK6P3%2BqwQtpQLWzvVzbACcAz9ZgxgRe%2FccooWDAc%3D&Expires=1622791343
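In a URL of this form, the `Expires` query parameter is a Unix timestamp marking when the link stops working. As a small sketch (the helper name `presigned_url_expiry` is hypothetical, and it assumes the URL carries an `Expires` parameter as in the output above), the expiry can be read back like this:

```python
from datetime import datetime, timezone
from urllib.parse import urlparse, parse_qs


def presigned_url_expiry(url):
    """Return the expiry of a presigned URL as a timezone-aware UTC datetime.

    Assumes the URL has an `Expires` query parameter holding epoch seconds.
    """
    query = parse_qs(urlparse(url).query)
    expires_epoch = int(query["Expires"][0])
    return datetime.fromtimestamp(expires_epoch, tz=timezone.utc)
```

Calling `presigned_url_expiry(share_url)` and comparing the result with `datetime.now(timezone.utc)` shows whether a previously generated link is still usable.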


In [43]:
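df_list = []

# download every object in the bucket and parse it as a CSV file
for file in response['Contents']:
    obj = s3.get_object(Bucket='lcssfinal', Key=file['Key'])
    obj_df = pd.read_csv(obj['Body'])
    df_list.append(obj_df)

df = pd.concat(df_list) # combine all files into a single DataFrame

df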

Unnamed: 0,First Name,Last Name,Date of Birth,Height,Weight,Gender Identity,Handedness,Email Address,Cell Phone Number,Preferred Language
0,Max,Kramer,07/25/1997,"6'0""",215lbs,Male,Right,mkramer1@uchicago.edu,7733185225,en
1,Coen,Needell,07/22/1994,"5'10""",140lbs,Male,Right,mkramer1@uchicago.edu,7733185225,en
2,Deepa,Prasad,04/02/1993,"5'7""",120lbs,Female,Left,mkramer1@uchicago.edu,7733185225,en
3,Wilma,Bainbridge,10/12/1998,"5'4""",125lbs,Female,Left,mkramer1@uchicago.edu,7733185225,es
4,Leon,Zhou,07/25/1997,"6'1""",140lbs,Male,Right,mkramer1@uchicago.edu,7733185225,es
5,Madeline,Gedvila,07/22/1994,"5'5""",155lbs,Female,Right,mkramer1@uchicago.edu,7733185225,zh
6,Rebecca,Greenberg,04/02/1993,"5'6""",110lbs,Female,Left,mkramer1@uchicago.edu,7733185225,zh
7,Trent,Davis,10/12/1998,"5'7""",125lbs,Male,Right,mkramer1@uchicago.edu,7733185225,fr
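The loop above reads every object in the bucket as a CSV file, and a single listing call returns at most 1,000 keys. For a larger or mixed bucket, a sketch along the following lines may help. It is an illustration only: `list_csv_keys` is a hypothetical helper, and it assumes a client that provides `get_paginator('list_objects_v2')` as the boto3 S3 client does.

```python
def list_csv_keys(s3_client, bucket):
    """Return every key ending in .csv in `bucket`, following pagination.

    `s3_client` is assumed to behave like boto3.client('s3').
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket):
        # 'Contents' is absent from a page that holds no objects
        for item in page.get("Contents", []):
            if item["Key"].lower().endswith(".csv"):
                keys.append(item["Key"])
    return keys
```

The returned keys could then replace `response['Contents']` in the download loop, for example `for key in list_csv_keys(s3, 'lcssfinal'): ...`.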


### Generate SNS Topic & Subscribe Participants

In [53]:
cell_numbers = df['Cell Phone Number'] # get phone numbers
emails = df['Email Address'] # get email addresses

In [56]:
response = sns.create_topic(Name="IRB_compliance") # generate new SNS topic for IRB forms
IRB_ARN = response['TopicArn'] # get TopicArn

In [71]:
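for index, row in df.iterrows():
    # subscribe the participant's email address to the IRB topic
    resp_email = sns.subscribe(
        TopicArn=IRB_ARN,
        Protocol='email',
        Endpoint=row['Email Address'],
    )

    # subscribe the participant's cell phone; the "+1" prefix assumes a US number
    resp_cell = sns.subscribe(
        TopicArn=IRB_ARN,
        Protocol='sms',
        Endpoint="+1" + str(row["Cell Phone Number"]),
    )
    break  # only the first participant is subscribed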
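The SMS endpoint above is built by prefixing `+1` to the stored number, which only works when every entry is a clean 10-digit US number. A hedged sketch of a more defensive version follows. Both helper names are hypothetical, and the code assumes US numbers stored as integers or strings.

```python
import re


def to_e164_us(raw_number):
    """Format a US phone number as +1XXXXXXXXXX, or raise ValueError."""
    digits = re.sub(r"\D", "", str(raw_number))  # keep digits only
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop a leading country code
    if len(digits) != 10:
        raise ValueError(f"not a 10-digit US number: {raw_number!r}")
    return "+1" + digits


def unique_endpoints(values):
    """Drop duplicate endpoints while preserving their original order."""
    return list(dict.fromkeys(values))
```

Applying `unique_endpoints` to the normalized numbers (and to the email column) before the loop would avoid repeated subscription requests for the same endpoint when participants share contact details.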

In [72]:
message = "this is just a test"
subject = "testing SNS for LCSS final"