# AWS Glue Studio Notebook
##### You are now running an AWS Glue Studio notebook. To start using it, you need to start an AWS Glue Interactive Session.


#### Optional: Run this cell to see available notebook commands ("magics").


In [None]:
%help

#### Run this cell to set up and start your interactive session.


In [1]:
%idle_timeout 2880
%glue_version 3.0
%worker_type G.1X
%number_of_workers 5

import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
  
sc = SparkContext.getOrCreate()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)

Welcome to the Glue Interactive Sessions Kernel
For more information on available magic commands, please type %help in any new cell.

Please view our Getting Started page to access the most up-to-date information on the Interactive Sessions kernel: https://docs.aws.amazon.com/glue/latest/dg/interactive-sessions.html
Installed kernel version: 0.38.1 
Current idle_timeout is 2800 minutes.
idle_timeout has been set to 2880 minutes.
Setting Glue version to: 3.0
Previous worker type: G.1X
Setting new worker type to: G.1X
Previous number of workers: 5
Setting new number of workers to: 5
Authenticating with environment variables and user-defined glue_role_arn: arn:aws:iam::113506796683:role/awsglueFullRole
Trying to create a Glue session for the kernel.
Worker Type: G.1X
Number of Workers: 5
Session ID: be4a8fb4-21a8-483d-a6d0-d1c53c5712a3
Job Type: glueetl
Applying the following default arguments:
--glue_kernel_version 0.38.1
--enable-glue-datacatalog true
Waiting for session be4a8fb4-21a8-4

In [2]:
# Read the CSV from S3 into a DynamicFrame; withHeader takes column names from the first row.
inventory_table = glueContext.create_dynamic_frame.from_options(connection_type="s3",
                                                                connection_options={"paths": ["s3://python-demo-bucket737860/highestprices.csv"]},
                                                                format="csv", format_options={"withHeader": True, "optimizePerformance": True})




In [3]:
inventory_table.printSchema()

root
|-- prod_id: string
|-- name: string
|-- price: string
|-- stock: string
|-- type: string
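
Every column in the schema above is a `string`, which is what a header-based CSV read gives you when no types are inferred. The sketch below reproduces that behavior locally with Python's standard `csv` module (not the Glue reader) and shows one way to cast the fields afterwards. `SAMPLE_CSV` and the `cast_record` helper are hypothetical names made up for illustration; inside a Glue job, a transform such as `ApplyMapping` would typically do the casting instead.

```python
import csv
import io

# Two rows copied from the notebook output; this in-memory text is a
# stand-in for the S3 object and is only an assumption for illustration.
SAMPLE_CSV = """prod_id,name,price,stock,type
1014,android watch,9988.0,3,ELECTRONIC
1017,mac,9999.0,5,ELECTRONIC
"""


def cast_record(record):
    """Return a copy of record with the numeric fields converted from str."""
    typed = dict(record)
    typed["prod_id"] = int(record["prod_id"])
    typed["price"] = float(record["price"])
    typed["stock"] = int(record["stock"])
    return typed


# DictReader uses the first row as field names and yields every value as str.
raw_rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
typed_rows = [cast_record(row) for row in raw_rows]

print(type(raw_rows[0]["stock"]).__name__, type(typed_rows[0]["stock"]).__name__)
```

Casting up front keeps later arithmetic, such as the stock doubling further down, free of repeated `int(...)` calls.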


In [4]:
inventory_table.show()

{"prod_id": "1014", "name": "android watch", "price": "9988.0", "stock": "3", "type": "ELECTRONIC"}
{"prod_id": "1015", "name": "sony wf 1000 xm4", "price": "20999.0", "stock": "7", "type": "ELECTRONIC"}
{"prod_id": "1016", "name": "sony wh 1000xm4", "price": "19999.0", "stock": "2", "type": "ELECTRONIC"}
{"prod_id": "1017", "name": "mac", "price": "9999.0", "stock": "5", "type": "ELECTRONIC"}


In [5]:
inventory_table.toDF().show()

+-------+----------------+-------+-----+----------+
|prod_id|            name|  price|stock|      type|
+-------+----------------+-------+-----+----------+
|   1014|   android watch| 9988.0|    3|ELECTRONIC|
|   1015|sony wf 1000 xm4|20999.0|    7|ELECTRONIC|
|   1016| sony wh 1000xm4|19999.0|    2|ELECTRONIC|
|   1017|             mac| 9999.0|    5|ELECTRONIC|
+-------+----------------+-------+-----+----------+


In [9]:
from awsglue.transforms import Map
from awsglue.dynamicframe import DynamicFrame

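def double_stock(record):
    # CSV values arrive as strings, so convert stock to int before doubling it.
    record["new_stock"] = int(record["stock"]) * 2
    return record

# Apply double_stock to every record; the result is a new DynamicFrame.
new_dyf = Map.apply(frame=inventory_table, f=double_stock)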




In [10]:
new_dyf.toDF().show()

+-------+-------+----------+---------+----------------+-----+
|prod_id|  price|      type|new_stock|            name|stock|
+-------+-------+----------+---------+----------------+-----+
|   1014| 9988.0|ELECTRONIC|        6|   android watch|    3|
|   1015|20999.0|ELECTRONIC|       14|sony wf 1000 xm4|    7|
|   1016|19999.0|ELECTRONIC|        4| sony wh 1000xm4|    2|
|   1017| 9999.0|ELECTRONIC|       10|             mac|    5|
+-------+-------+----------+---------+----------------+-----+
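
The function given to `Map.apply` is called once per record, so its logic can be checked locally on plain dictionaries before a session is started. The sketch below assumes records can be treated like dicts, as `double_stock` above does. `double_stock_safe` is a hypothetical variant that sets `new_stock` to `None` when `stock` is missing or not an integer instead of raising, and the third sample row is invented to exercise that path.

```python
def double_stock_safe(record):
    """Add new_stock = 2 * stock; use None when stock cannot be parsed."""
    try:
        record["new_stock"] = int(record["stock"]) * 2
    except (KeyError, TypeError, ValueError):
        # Missing key, None value, or non-numeric text such as "n/a".
        record["new_stock"] = None
    return record


rows = [
    {"prod_id": "1014", "stock": "3"},
    {"prod_id": "1015", "stock": "7"},
    {"prod_id": "9999", "stock": "n/a"},  # made-up bad record
]

# Copy each row first so the sample input is left unmodified.
results = [double_stock_safe(dict(row)) for row in rows]
for row in results:
    print(row["prod_id"], row["new_stock"])
```

Whether to return `None`, skip the record, or let the error surface depends on how downstream consumers should treat bad input; this sketch only shows one option.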


In [11]:
type(new_dyf)

<class 'awsglue.dynamicframe.DynamicFrame'>


In [13]:
new_dyf.printSchema()

root
|-- prod_id: string
|-- price: string
|-- type: string
|-- new_stock: int
|-- name: string
|-- stock: string


In [12]:
# Write the transformed frame to S3 as CSV; writeHeader emits the column names as the first row.
glueContext.write_dynamic_frame.from_options(frame=new_dyf,
                                            connection_type="s3",
                                            connection_options={"path": "s3://gluetest-1bucket"},
                                            format="csv",
                                            format_options={"writeHeader": True, "separator": ","},
                                            transformation_ctx="datauploaded")

<awsglue.dynamicframe.DynamicFrame object at 0x7f9bed28f8d0>
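
To preview the kind of file a CSV write with a header row and a comma separator produces, the rows can be serialized locally with Python's `csv` module. This only illustrates the layout and is not the Glue writer: the column order is taken from the `new_dyf` schema printed earlier, the two rows are copied from the output above, and the exact quoting, line endings, and file naming that Glue uses on S3 may differ.

```python
import csv
import io

# Column order copied from the new_dyf schema printed earlier.
FIELDS = ["prod_id", "price", "type", "new_stock", "name", "stock"]

rows = [
    {"prod_id": "1014", "price": "9988.0", "type": "ELECTRONIC",
     "new_stock": 6, "name": "android watch", "stock": "3"},
    {"prod_id": "1017", "price": "9999.0", "type": "ELECTRONIC",
     "new_stock": 10, "name": "mac", "stock": "5"},
]

# Write into an in-memory buffer instead of S3.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS, delimiter=",", lineterminator="\n")
writer.writeheader()    # first row holds the column names
writer.writerows(rows)  # one line per record

csv_text = buffer.getvalue()
print(csv_text)
```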
