
# Glue Studio Notebook
You are now running a **Glue Studio** notebook. Before you can use it, you *must* start an interactive session.

## Available Magics
|          Magic              |   Type       |                                                                        Description                                                                        |
|-----------------------------|--------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------|
| %%configure                 |  Dictionary  |  A JSON-formatted dictionary consisting of all configuration parameters for a session. Each parameter can be specified here or through individual magics. |
| %profile                    |  String      |  Specify a profile in your AWS configuration to use as the credentials provider.                                                                          |
| %iam_role                   |  String      |  Specify an IAM role to execute your session with.                                                                                                        |
| %region                     |  String      |  Specify the AWS region in which to initialize a session.                                                                                                 |
| %session_id                 |  String      |  Returns the session ID for the running session.                                                                                                          |
| %connections                |  List        |  Specify a comma-separated list of connections to use in the session.                                                                                     |
| %additional_python_modules  |  List        |  Comma-separated list of pip packages, S3 paths or private pip arguments.                                                                                 |
| %extra_py_files             |  List        |  Comma-separated list of additional Python files from S3.                                                                                                 |
| %extra_jars                 |  List        |  Comma-separated list of additional Jars to include in the cluster.                                                                                       |
| %number_of_workers          |  Integer     |  The number of workers of a defined worker_type that are allocated when a job runs. worker_type must be set too.                                          |
| %worker_type                |  String      |  Standard, G.1X, *or* G.2X. number_of_workers must be set too. Default is G.1X.                                                                           |
| %glue_version               |  String      |  The version of Glue to be used by this session. Currently, the only valid options are 2.0 and 3.0.                                                       |
| %security_configuration     |  String      |  Define a security configuration to be used with this session.                                                                                            |
| %%sql                       |  String      |  Run SQL code. All lines after the initial %%sql magic will be passed as part of the SQL code.                                                            |
| %streaming                  |  String      |  Changes the session type to Glue Streaming.                                                                                                              |
| %etl                        |  String      |  Changes the session type to Glue ETL.                                                                                                                    |
| %status                     |              |  Returns the status of the current Glue session including its duration, configuration and executing user / role.                                          |
| %stop_session               |              |  Stops the current session.                                                                                                                               |
| %list_sessions              |              |  Lists all currently running sessions by name and ID.                                                                                                     |
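
The table notes that `number_of_workers` and `worker_type` must be set together, and that `glue_version` currently accepts only 2.0 and 3.0. As a rough sketch, those rules can be checked locally before pasting settings into a `%%configure` cell. The helper `render_configure_cell` below is hypothetical and not part of the Glue kernel. It also assumes that the `%%configure` keys mirror the magic names without the leading `%`, which should be verified against the Glue interactive sessions documentation.

```python
import json

# Allowed values as listed in the magics table above.
VALID_WORKER_TYPES = {"Standard", "G.1X", "G.2X"}
VALID_GLUE_VERSIONS = {"2.0", "3.0"}


def render_configure_cell(settings):
    """Validate session settings and return the text of a %%configure cell.

    Hypothetical helper: key names are assumed to mirror the magic names.
    """
    has_workers = "number_of_workers" in settings
    has_type = "worker_type" in settings
    # The table states that each of these two settings requires the other.
    if has_workers != has_type:
        raise ValueError("number_of_workers and worker_type must be set together")
    if has_type and settings["worker_type"] not in VALID_WORKER_TYPES:
        raise ValueError(f"unsupported worker_type: {settings['worker_type']}")
    if "glue_version" in settings and settings["glue_version"] not in VALID_GLUE_VERSIONS:
        raise ValueError(f"unsupported glue_version: {settings['glue_version']}")
    # %%configure expects a JSON-formatted dictionary as the cell body.
    return "%%configure\n" + json.dumps(settings, indent=4)


cell_text = render_configure_cell(
    {"worker_type": "G.1X", "number_of_workers": 5, "glue_version": "3.0"}
)
print(cell_text)
```

The printed text is meant to be pasted into its own notebook cell before the session is created.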

In [None]:
import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
  
sc = SparkContext.getOrCreate()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)

Trying to create a Glue session for the kernel.
Worker Type: G.1X
Number of Workers: 5
Session ID: 9893b95c-54ac-4c02-9f4d-4126d53508f0
Applying the following default arguments:
--glue_kernel_version 0.30
--enable-glue-datacatalog true
Waiting for session 9893b95c-54ac-4c02-9f4d-4126d53508f0 to get into ready status...
Session 9893b95c-54ac-4c02-9f4d-4126d53508f0 has been created




In [3]:
%list_sessions

The first 1 sessions are:
9893b95c-54ac-4c02-9f4d-4126d53508f0


In [5]:
# Read the CSV files under the S3 prefix, treating the first row as column names.
test_dynamicframe = glueContext.create_dynamic_frame.from_options(
    's3', {'paths': ['s3://bt-test-bck/input']}, 'csv', {'withHeader': True}
)
print("Count:", test_dynamicframe.count())
test_dynamicframe.printSchema()

Count: 20
root
|-- S.no: string
|-- col1: string
|-- col2: string
|-- col3: string
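
The schema above reports every column as `string`, including `S.no`, because a CSV file carries no type information and the header row only supplies column names. The sketch below shows the same behaviour with Python's standard `csv` module. It is an analogy, not Glue's reader. The in-memory file is an assumed layout for the input, using values taken from the rows this notebook prints.

```python
import csv
import io

# Assumed file layout: a header row followed by data rows.
# The in-memory buffer stands in for an object under the S3 prefix.
sample_csv = io.StringIO(
    "S.no,col1,col2,col3\n"
    "1,col1data1,col2data1,col3data1\n"
    "2,col1data2,col2data2,col3data2\n"
)

# DictReader uses the first row as field names, similar in spirit to withHeader.
reader = csv.DictReader(sample_csv)
rows = list(reader)

print(reader.fieldnames)
for row in rows:
    print(row)

# Every value arrives as a string; numeric columns need an explicit cast.
typed = [{**row, "S.no": int(row["S.no"])} for row in rows]
print(typed[0])
```

In a Glue job the equivalent cast would be applied to the DynamicFrame itself. The plain-Python version only illustrates why the cast is needed.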


In [7]:
test_dynamicframe.show()

{"S.no": "1", "col1": "col1data1", "col2": "col2data1", "col3": "col3data1"}
{"S.no": "2", "col1": "col1data2", "col2": "col2data2", "col3": "col3data2"}
{"S.no": "3", "col1": "col1data3", "col2": "col2data3", "col3": "col3data3"}
{"S.no": "4", "col1": "col1data4", "col2": "col2data4", "col3": "col3data4"}
{"S.no": "5", "col1": "col1data5", "col2": "col2data5", "col3": "col3data5"}
{"S.no": "6", "col1": "col1data6", "col2": "col2data6", "col3": "col3data6"}
{"S.no": "7", "col1": "col1data7", "col2": "col2data7", "col3": "col3data7"}
{"S.no": "8", "col1": "col1data8", "col2": "col2data8", "col3": "col3data8"}
{"S.no": "9", "col1": "col1data9", "col2": "col2data9", "col3": "col3data9"}
{"S.no": "10", "col1": "col1data10", "col2": "col2data10", "col3": "col3data10"}
{"S.no": "1", "col1": "col1data1", "col2": "col2data1", "col3": "col3data1"}
{"S.no": "2", "col1": "col1data2", "col2": "col2data2", "col3": "col3data2"}
{"S.no": "3", "col1": "col1data3", "col2": "col2data3", "col3": "col3dat