# Set Up All Workshop Dependencies

## _Note:  This Notebook Will Take A Few Minutes To Complete._

## _Please Be Patient._

## _==> Please ignore all WARNINGs and ERRORs from the `pip install`'s below. <==_

<a name='1'></a>
## Set up Kernel and Required Dependencies

First, check that the correct kernel is chosen.

<img src="img/kernel_set_up.png" width="300"/>

Click the kernel name to check the details of the image, kernel, and instance type.

<img src="img/w3_kernel_and_instance_type.png" width="600"/>

In [None]:
!python --version

In [None]:
import psutil

notebook_memory = psutil.virtual_memory()
print(notebook_memory)

# Always define the flag so it exists whether or not the check passes.
correct_instance_type = notebook_memory.total >= 32 * 1000 * 1000 * 1000

if not correct_instance_type:
    print('*******************************************')
    print('YOU ARE NOT USING THE CORRECT INSTANCE TYPE')
    print('PLEASE CHANGE INSTANCE TYPE TO  m5.2xlarge ')
    print('*******************************************')

In [None]:
import boto3
import sagemaker

sess = sagemaker.Session()
bucket = sess.default_bucket()
region = boto3.Session().region_name

# SageMaker

In [None]:
%pip install --disable-pip-version-check -q sagemaker==2.136.0

In [None]:
%pip install --disable-pip-version-check -q sagemaker-experiments==0.1.28

# PyTorch

In [None]:
%pip uninstall -y --disable-pip-version-check \
    torch==1.13.1 \
    torchdata==0.5.1 --quiet

In [None]:
%pip install --disable-pip-version-check \
    torch==1.13.1 \
    torchdata==0.5.1 --quiet

# Hugging Face Transformers and other language libraries

In [None]:
%pip install --disable-pip-version-check \
    transformers==4.27.2 \
    datasets==2.11.0 \
    accelerate==0.17.0 \
    bitsandbytes==0.37.0 \
    promptsource==0.2.3 \
    evaluate==0.4.0 \
    py7zr==0.20.4 \
    sentencepiece==0.1.99 \
    rouge_score==0.1.2 \
    loralib==0.1.1 \
    peft==0.3.0 \
    git+https://github.com/lvwerra/trl.git@25fa1bd --quiet

# Ray

In [None]:
%pip install --disable-pip-version-check -q ray==2.5.0

# PyAthena

In [None]:
%pip install --disable-pip-version-check -q PyAthena==2.1.1

# Redshift

In [None]:
%pip install --disable-pip-version-check -q SQLAlchemy==1.3.23
%pip install --disable-pip-version-check -q psycopg2-binary==2.9.1

# Zip
## _==> Please ignore all WARNINGs and ERRORs from the `conda install`'s below. <==_

In [None]:
%conda install -q -y zip

# Matplotlib

In [None]:
%pip install --disable-pip-version-check -q matplotlib==3.1.3

# Seaborn

In [None]:
%pip install --disable-pip-version-check -q seaborn==0.10.0

# Scikit-Learn

In [None]:
%pip install --disable-pip-version-check -q scikit-learn==0.23.1

In [None]:
import sagemaker

try:
    role = sagemaker.get_execution_role()
    print(role)
except Exception:
    # role = '<SAGEMAKER_EXECUTION_ROLE_NAME>'  # e.g. arn:aws:iam::<ACCOUNT_ID>:role/service-role/AmazonSageMaker-ExecutionRole-XXXXXXXXXXXX
    print('Could not get the execution role. Please set the role directly by uncommenting the "role = ..." line above.')

%store role

# Summarize Python Dependencies

In [None]:
!python --version

In [None]:
!pip list
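
The `pip list` output is long, so it can help to compare a few key packages against the versions pinned in the install cells above. The following is a minimal sketch, not part of the original workshop: the `PINNED` dictionary and the `check_pins` helper are hypothetical names, and the dictionary lists only a subset of the pins.

```python
from importlib.metadata import PackageNotFoundError, version

# Hypothetical subset of the versions pinned in the install cells above.
PINNED = {
    "sagemaker": "2.136.0",
    "torch": "1.13.1",
    "transformers": "4.27.2",
    "datasets": "2.11.0",
    "peft": "0.3.0",
}


def check_pins(pins):
    """Map each package name to (expected, installed, matches)."""
    report = {}
    for name, expected in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        # Ignore a local build suffix such as "+cu117" when comparing.
        matches = installed is not None and installed.split("+")[0] == expected
        report[name] = (expected, installed, matches)
    return report


for name, (expected, installed, matches) in check_pins(PINNED).items():
    status = "OK" if matches else "MISMATCH"
    print(f"{status:8} {name}: expected {expected}, installed {installed}")
```

A `MISMATCH` line right after installing usually means the kernel still has the old version loaded; restarting the kernel and re-running the check is a reasonable first step.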

# Set Up Datasets

# Copy Data From the Public S3 Bucket to our Private S3 Bucket in this Account
Because the full dataset is large, we copy only three files into our bucket to speed up later steps.

In [None]:
s3_public_path_tsv = "s3://dsoaws/tsv"

In [None]:
s3_private_path_tsv = "s3://{}/amazon-reviews-pds/tsv".format(bucket)
print(s3_private_path_tsv)

In [None]:
!aws s3 cp --recursive $s3_public_path_tsv/ $s3_private_path_tsv/ --exclude "*" --include "amazon_reviews_us_Books_v1_00.tsv.gz"
!aws s3 cp --recursive $s3_public_path_tsv/ $s3_private_path_tsv/ --exclude "*" --include "amazon_reviews_us_Books_v1_02.tsv.gz"
!aws s3 cp --recursive $s3_public_path_tsv/ $s3_private_path_tsv/ --exclude "*" --include "amazon_reviews_us_Digital_Video_Download_v1_00.tsv.gz"
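
To confirm that the three files arrived in the private bucket, one option is to list the destination prefix with boto3. This is a sketch under assumptions: `split_s3_uri`, `missing_files`, and `EXPECTED_FILES` are hypothetical names introduced here, and the listing call returns at most 1,000 keys per request, which is enough for this small prefix.

```python
# File names taken from the copy commands above.
EXPECTED_FILES = [
    "amazon_reviews_us_Books_v1_00.tsv.gz",
    "amazon_reviews_us_Books_v1_02.tsv.gz",
    "amazon_reviews_us_Digital_Video_Download_v1_00.tsv.gz",
]


def split_s3_uri(uri):
    """Split 's3://bucket/prefix' into (bucket, prefix)."""
    if not uri.startswith("s3://"):
        raise ValueError(f"not an S3 URI: {uri}")
    bucket_name, _, prefix = uri[len("s3://"):].partition("/")
    return bucket_name, prefix


def missing_files(uri, expected_files):
    """Return the expected file names not found under the S3 prefix."""
    import boto3  # imported lazily so split_s3_uri works without AWS access

    bucket_name, prefix = split_s3_uri(uri)
    response = boto3.client("s3").list_objects_v2(Bucket=bucket_name, Prefix=prefix)
    # Compare on the file name only, ignoring the key prefix.
    found = {obj["Key"].rsplit("/", 1)[-1] for obj in response.get("Contents", [])}
    return [name for name in expected_files if name not in found]


# Usage in the notebook (requires AWS credentials, so it is left commented out):
# print(missing_files(s3_private_path_tsv, EXPECTED_FILES))
```

An empty list from `missing_files` would indicate that all three files are present.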

In [None]:
%store s3_public_path_tsv

In [None]:
%store s3_private_path_tsv

# Copy Dialogue Summary Datasets to our Private S3 Bucket

In [None]:
sess.upload_data('./data-summarization/dialogsum-1.csv', bucket=bucket, key_prefix='data-summarization')
sess.upload_data('./data-summarization/dialogsum-2.csv', bucket=bucket, key_prefix='data-summarization')

In [None]:
raw_input_data_s3_uri = f's3://{bucket}/data-summarization/'
print(raw_input_data_s3_uri)

In [None]:
!aws s3 ls $raw_input_data_s3_uri

In [None]:
%store raw_input_data_s3_uri

In [None]:
setup_dependencies_passed = True

# Store the Variables for the Next Notebooks
The next notebooks use these variables. For more information on `%store`, see the documentation: https://ipython.readthedocs.io/en/stable/config/extensions/storemagic.html

In [None]:
model_checkpoint='google/flan-t5-base'

In [None]:
%store model_checkpoint

In [None]:
%store setup_dependencies_passed

In [None]:
%store
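
In a later notebook, the stored variables can be restored with the `%store -r` magic. The sketch below shows one way to check that everything this notebook stored is available after restoring. The `REQUIRED` list and the `missing_variables` helper are hypothetical names, and the magic is written as a comment so the block stays plain Python.

```python
# First cell of a later notebook (IPython magic, shown here as a comment):
#   %store -r

# Names stored by this setup notebook.
REQUIRED = [
    "role",
    "s3_public_path_tsv",
    "s3_private_path_tsv",
    "raw_input_data_s3_uri",
    "model_checkpoint",
    "setup_dependencies_passed",
]


def missing_variables(names, namespace):
    """Return the names that are not defined in the given namespace."""
    return [name for name in names if name not in namespace]


missing = missing_variables(REQUIRED, globals())
if missing:
    print("Re-run the setup notebook; missing variables:", ", ".join(missing))
else:
    print("All setup variables are available.")
```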

# Release Resources

In [None]:
%%html

<p><b>Shutting down your kernel for this notebook to release resources.</b></p>
<button class="sm-command-button" data-commandlinker-command="kernelmenu:shutdown" style="display:none;">Shutdown Kernel</button>
        
<script>
try {
    els = document.getElementsByClassName("sm-command-button");
    els[0].click();
}
catch(err) {
    // NoOp
}    
</script>