# AWS Setup

## IAM User

Create an IAM user with the `AdministratorAccess` permission. Instructions are on p. 7 of "AWS New User Orientation.pdf".

**Make sure to download and save the user credentials CSV file in the last step.**

In [15]:
!ls

AWS New User Orientation.pdf aws_setup.ipynb


## AWS CLI 

See p. 24 of "AWS New User Orientation.pdf".

Installation instructions can be found [here](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html).

After installation, run `aws configure` in your terminal and provide your Access Key ID and Secret Access Key to set up credentials. We recommend `us-west-2` as the default region name and `json` as the default output format.

In [5]:
# verify setup is successful
!aws s3 ls

2022-10-15 10:09:25 sagemaker-studio-87bjv2x7mmh
2022-10-15 10:15:56 sagemaker-us-west-2-662235870471
2022-10-08 16:28:53 w210-trip
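
If `aws s3 ls` fails, it can help to check what `aws configure` actually wrote. The sketch below assumes the standard file locations (`~/.aws/credentials` and `~/.aws/config`); `check_aws_profile` is a hypothetical helper name, not part of the AWS CLI or boto3.

```python
import configparser
from pathlib import Path


def check_aws_profile(profile="default", aws_dir=None):
    """Return a list of problems found in the files written by `aws configure`."""
    aws_dir = Path(aws_dir) if aws_dir else Path.home() / ".aws"
    problems = []

    # Access keys live in the credentials file under a [profile] section.
    creds = configparser.ConfigParser()
    creds.read(aws_dir / "credentials")  # a missing file is silently skipped
    for key in ("aws_access_key_id", "aws_secret_access_key"):
        if not creds.has_option(profile, key):
            problems.append(f"credentials: missing {key} for [{profile}]")

    # Region and output format live in the config file. Non-default
    # profiles use a "profile <name>" section header there.
    config = configparser.ConfigParser()
    config.read(aws_dir / "config")
    section = profile if profile == "default" else f"profile {profile}"
    if not config.has_option(section, "region"):
        problems.append(f"config: missing region for [{section}]")

    return problems
```

Calling `check_aws_profile()` with no arguments inspects the `default` profile; an empty list means both files contain the expected entries.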


## Python API
Set up the AWS CLI first; the steps below use the credentials it stores.

In [7]:
%pip install boto3 s3fs

Note: you may need to restart the kernel to use updated packages.


### boto3

In [9]:
import boto3 # aws api for python
import logging
from botocore.exceptions import ClientError
import pandas as pd

In [10]:
profile_name = "default"
region_name = "us-west-2"

In [11]:
session = boto3.Session(profile_name=profile_name)
s3 = session.client("s3", region_name=region_name)

In [12]:
# verify setup
response = s3.list_buckets()
buckets = response["Buckets"]
buckets

[{'Name': 'sagemaker-studio-87bjv2x7mmh',
  'CreationDate': datetime.datetime(2022, 10, 15, 17, 9, 25, tzinfo=tzutc())},
 {'Name': 'sagemaker-us-west-2-662235870471',
  'CreationDate': datetime.datetime(2022, 10, 15, 17, 15, 56, tzinfo=tzutc())},
 {'Name': 'w210-trip',
  'CreationDate': datetime.datetime(2022, 10, 8, 23, 28, 53, tzinfo=tzutc())}]
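
The timestamps here are in UTC, while `aws s3 ls` prints them in the machine's local time, which is why the two listings differ by 7 hours. A minimal sketch of the conversion follows; the UTC-7 offset is an assumption inferred from that gap, and `format_buckets` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

# Assumption: UTC-7, inferred from the 7-hour gap between the two listings.
# Replace with your own machine's offset.
LOCAL_TZ = timezone(timedelta(hours=-7))


def format_buckets(response, tz=LOCAL_TZ):
    """Format a list_buckets() response like the output of `aws s3 ls`."""
    lines = []
    for bucket in response["Buckets"]:
        created = bucket["CreationDate"].astimezone(tz)
        lines.append(f"{created:%Y-%m-%d %H:%M:%S} {bucket['Name']}")
    return lines


# Stand-in for the `response` returned by s3.list_buckets().
sample = {
    "Buckets": [
        {
            "Name": "w210-trip",
            "CreationDate": datetime(2022, 10, 8, 23, 28, 53, tzinfo=timezone.utc),
        },
    ]
}
print("\n".join(format_buckets(sample)))  # 2022-10-08 16:28:53 w210-trip
```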

### s3fs

In [13]:
# s3fs lets pandas read and write S3 files as if they were on your local file system
df_buz = pd.read_json("s3://w210-trip/data/yelp_academic_dataset_business.json", lines=True)

In [14]:
df_buz.head(3)

| | business_id | name | address | city | state | postal_code | latitude | longitude | stars | review_count | is_open | attributes | categories | hours |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Pns2l4eNsfO8kk83dixA6A | Abby Rappoport, LAC, CMQ | 1616 Chapala St, Ste 2 | Santa Barbara | CA | 93101 | 34.426679 | -119.711197 | 5.0 | 7 | 0 | {'ByAppointmentOnly': 'True'} | Doctors, Traditional Chinese Medicine, Naturop... | |
| 1 | mpf3x-BjTdTEA3yCZrAYPw | The UPS Store | 87 Grasso Plaza Shopping Center | Affton | MO | 63123 | 38.551126 | -90.335695 | 3.0 | 15 | 1 | {'BusinessAcceptsCreditCards': 'True'} | Shipping Centers, Local Services, Notaries, Ma... | {'Monday': '0:0-0:0', 'Tuesday': '8:0-18:30', ... |
| 2 | tUFrWirKiKi_TAnsVWINQQ | Target | 5255 E Broadway Blvd | Tucson | AZ | 85711 | 32.223236 | -110.880452 | 3.5 | 22 | 0 | {'BikeParking': 'True', 'BusinessAcceptsCredit... | Department Stores, Shopping, Fashion, Home & G... | {'Monday': '8:0-22:0', 'Tuesday': '8:0-22:0', ... |
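
s3fs takes a single `s3://bucket/key` URL, whereas boto3 calls take the bucket and key as separate arguments. A small sketch for moving between the two; `split_s3_url` is a hypothetical helper, and it assumes the key contains no `?` or `#` characters, which `urlparse` would treat as query or fragment delimiters.

```python
from urllib.parse import urlparse


def split_s3_url(url):
    """Split 's3://bucket/key' into (bucket, key)."""
    parsed = urlparse(url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URL: {url!r}")
    return parsed.netloc, parsed.path.lstrip("/")


bucket, key = split_s3_url(
    "s3://w210-trip/data/yelp_academic_dataset_business.json"
)
print(bucket, key)
# With a boto3 S3 client, the same object could then be fetched with
# s3.get_object(Bucket=bucket, Key=key).
```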


## SageMaker Studio - Jupyter environment

Setup is straightforward; instructions are on p. 28 of the PDF.

**SageMaker Studio starts charging as soon as you open the Jupyter interface, and it keeps running even after you close the tab. The only way to stop it is to delete the app under SageMaker Control Panel. Make sure to DELETE the app every time you finish.**
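
As a safety net, the same cleanup could be scripted with boto3. This is only a sketch, not a replacement for checking the Control Panel: `delete_running_apps` is a hypothetical helper, it reads only the first page of `list_apps` results, and it deletes every app whose status is `InService`.

```python
def delete_running_apps(sm_client):
    """Delete every Studio app that is currently InService; return their names."""
    deleted = []
    # Note: list_apps is paginated; this sketch only reads the first page.
    for app in sm_client.list_apps()["Apps"]:
        if app["Status"] != "InService":
            continue
        params = {
            "DomainId": app["DomainId"],
            "AppType": app["AppType"],
            "AppName": app["AppName"],
        }
        # An app belongs to either a user profile or a shared space.
        for owner in ("UserProfileName", "SpaceName"):
            if owner in app:
                params[owner] = app[owner]
        sm_client.delete_app(**params)
        deleted.append(app["AppName"])
    return deleted


# Usage, assuming a boto3 session configured as in the Python API section:
# sm = session.client("sagemaker", region_name=region_name)
# print(delete_running_apps(sm))
```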

### Use GitHub to access your code

![git](./imgs/git.png)

### Change to a more powerful machine

![select](./imgs/instance.png)

Uncheck "Fast launch only" to access more instance types.

![type](./imgs/type.png)

### Request quota
If you don't have quota for the instance type you want to use, you need to request it:
* Open the [Service Quotas console](https://us-west-2.console.aws.amazon.com/servicequotas/home/services/sagemaker/quotas)
* Search for "Kernel Gateway Apps running on ml.g"
* Request an increase for the instance type you need
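
The search step can also be sketched in Python against the Service Quotas API. `find_quotas` is a hypothetical helper that filters pages shaped like the `list_service_quotas` response; the commented usage assumes a boto3 session configured as in the Python API section.

```python
def find_quotas(pages, keyword):
    """Return (name, code, value) for quotas whose name contains `keyword`."""
    matches = []
    for page in pages:
        for quota in page["Quotas"]:
            if keyword.lower() in quota["QuotaName"].lower():
                matches.append(
                    (quota["QuotaName"], quota["QuotaCode"], quota["Value"])
                )
    return matches


# Usage (not run here):
# sq = session.client("service-quotas", region_name=region_name)
# pages = sq.get_paginator("list_service_quotas").paginate(ServiceCode="sagemaker")
# for name, code, value in find_quotas(pages, "Kernel Gateway Apps running on ml.g"):
#     print(code, value, name)
```

A value of 0 for an instance type means an increase has to be requested before that instance can be selected.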

### Install git-lfs on SageMaker

```
curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.rpm.sh | sudo bash

sudo yum install git-lfs -y

git lfs install
```

## Hugging Face (To be continued...)
A model saved to Hugging Face can be deployed for inference.

`git-lfs` is required to save a model to Hugging Face.

In [None]:
from huggingface_hub import notebook_login

notebook_login()

In [22]:
! git-lfs --version

git-lfs/3.2.0 (GitHub; darwin arm64; go 1.18.2)
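
The same check can be done from Python, which is handy at the top of a notebook that pushes a model. A minimal sketch using only the standard library; `git_lfs_version` is a hypothetical helper name.

```python
import shutil
import subprocess


def git_lfs_version():
    """Return the `git-lfs --version` string, or None if git-lfs is not on PATH."""
    exe = shutil.which("git-lfs")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


version = git_lfs_version()
print(version or "git-lfs not found; install it before pushing a model")
```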
