# Getting Started with DataStage: Authenticating

<div>
<img src="./datastage-icon.svg" width="300" align="Left"/>
</div>



Estimated time needed: **15** minutes

## Objectives

After completing this lab you will be able to:

*   Authenticate with DataStage using IBM Cloud
*   Create a DataStage Service Instance
*   List your project's flows to verify service connection

**Verify Installation of DataStage for Python 3.x**:

In [None]:
!pip3 install --upgrade "datastage>=0.0.1"

## Authenticating with DataStage through IBM Cloud

The datastage library is built on top of IBM's Cloud SDK, which reads the values it needs for authentication from environment variables. Once your authorization type, API key, DataStage URL, authorization URL, and project ID are all set as environment variables, you can run datastage commands directly and the SDK handles authentication in the background.

In [None]:
# import os before we get started
import os

#### 1) View your current environment variables:
Running this cell will output a list of all current environment variables on your system.

In [None]:
os.environ

#### 2) Set the necessary environment variables either (a) explicitly or (b) using a credentials file
Doing so will allow the DataStage API to reference environment variables for authentication.

>(a) Explicitly declare environment variables

In [None]:
# Insert your API key below and run this cell to set environment variables explicitly
os.environ['DATASTAGE_APIKEY'] = 'YOUR_API_KEY'
# the three variables below are the same for all users using IAM authentication
os.environ['DATASTAGE_AUTH_TYPE']='iam'
os.environ['DATASTAGE_URL']='https://api.dataplatform.cloud.ibm.com/data_intg'
os.environ['DATASTAGE_AUTH_URL'] = 'https://iam.cloud.ibm.com/identity/token'
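
Before moving on, it can help to confirm that none of these variables is missing or still set to its placeholder. The sketch below is one way to do that. `REQUIRED_VARS` and `missing_datastage_vars` are hypothetical names introduced for illustration, not part of the datastage library, and the function accepts a plain mapping so it can be tried without touching your real environment.

```python
import os

# Variable names taken from the cell above; the helper itself is hypothetical.
REQUIRED_VARS = (
    "DATASTAGE_APIKEY",
    "DATASTAGE_AUTH_TYPE",
    "DATASTAGE_URL",
    "DATASTAGE_AUTH_URL",
)


def missing_datastage_vars(env=None, placeholder="YOUR_API_KEY"):
    """Return the required variables that are unset, empty, or still a placeholder."""
    env = os.environ if env is None else env
    return [
        name
        for name in REQUIRED_VARS
        if not env.get(name) or env.get(name) == placeholder
    ]


# Example with a sample mapping: the API key is still the placeholder
# and the two URLs are absent.
example_env = {"DATASTAGE_APIKEY": "YOUR_API_KEY", "DATASTAGE_AUTH_TYPE": "iam"}
print(missing_datastage_vars(example_env))
# → ['DATASTAGE_APIKEY', 'DATASTAGE_URL', 'DATASTAGE_AUTH_URL']
```

Calling `missing_datastage_vars()` with no arguments checks `os.environ`. If you use a credentials file instead, these variables may not appear in `os.environ`, so the check is only meaningful for option (a).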

>(b) Declare environment variables in a credentials file

If you'd prefer to store your authentication information in a credentials file, create a file named 'credentials.env' in your working directory, then paste the following lines into it, replacing the placeholder with your API key:

DATASTAGE_APIKEY=YOUR_API_KEY
DATASTAGE_AUTH_TYPE=iam
DATASTAGE_URL=https://api.dataplatform.cloud.ibm.com/data_intg
DATASTAGE_AUTH_URL=https://iam.cloud.ibm.com/identity/token

To load the contents of the credentials file, uncomment the following two lines and run the cell:

In [None]:
#filename = 'credentials.env'
#os.environ['IBM_CREDENTIALS_FILE'] = filename
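
A quick way to catch typos in `credentials.env` is to read it back and check which keys it defines. The sketch below assumes the simple `KEY=VALUE` layout shown above, one entry per line. `read_credentials` is a hypothetical helper, not an SDK function, and the SDK's own parsing may differ in its details.

```python
import tempfile
from pathlib import Path


def read_credentials(path):
    """Parse KEY=VALUE lines into a dict, skipping blank lines and # comments."""
    values = {}
    for raw_line in Path(path).read_text().splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values may themselves contain '='.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


# Demonstration on a throwaway file so no real credentials are touched.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "credentials.env"
    demo_path.write_text(
        "DATASTAGE_APIKEY=YOUR_API_KEY\n"
        "DATASTAGE_AUTH_TYPE=iam\n"
        "DATASTAGE_URL=https://api.dataplatform.cloud.ibm.com/data_intg\n"
        "DATASTAGE_AUTH_URL=https://iam.cloud.ibm.com/identity/token\n"
    )
    demo_values = read_credentials(demo_path)

print(sorted(demo_values))
```

To check your own file, call `read_credentials('credentials.env')` and confirm that all four keys are present.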

#### 3) Declare your project ID environment variable

To find your project ID, visit https://dataplatform.cloud.ibm.com/ where you can:

i. Log in to your Cloud Pak for Data account\
ii. Click on 'Projects' under 'Quick Navigation' on the left side of the screen\
iii. Select an existing project, or create a project from scratch\
iv. Once you're in a project, copy the link from your browser (like the one below) into the project_dashboard_url variable in the next cell

>Example project dashboard link: https://dataplatform.cloud.ibm.com/projects/68c271ee-a23d-4a7c-bef2-8c5f6b495d67/overview?context=cpdaas (do not use)

In [None]:
# copy project dashboard link here
project_dashboard_url = '<link to the dashboard of the project you are working on>'

In [None]:
from urllib.parse import urlparse

#parse the url
parsed_url = urlparse(project_dashboard_url)
#extract the project id from the url's path
project_id = parsed_url.path.split(sep='/')[2]

print("Your Project ID:", project_id)
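
Indexing the path at a fixed position works for links shaped like the example, but it returns a misleading value if the placeholder was left in or a different page was copied. The sketch below is a slightly more defensive variant. `extract_project_id` is a hypothetical helper, and the check that the ID is UUID-shaped is an assumption based on the example link, not a documented guarantee.

```python
import re
from urllib.parse import urlparse

# Assumption: project IDs look like the UUID in the example link above.
UUID_PATTERN = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)


def extract_project_id(dashboard_url):
    """Return the path segment that follows 'projects', or raise ValueError."""
    segments = [s for s in urlparse(dashboard_url).path.split("/") if s]
    if "projects" not in segments:
        raise ValueError("URL does not contain a /projects/ segment")
    index = segments.index("projects") + 1
    if index >= len(segments) or not UUID_PATTERN.match(segments[index]):
        raise ValueError("No UUID-shaped project ID found after /projects/")
    return segments[index]


example_url = (
    "https://dataplatform.cloud.ibm.com/projects/"
    "68c271ee-a23d-4a7c-bef2-8c5f6b495d67/overview?context=cpdaas"
)
print(extract_project_id(example_url))
# → 68c271ee-a23d-4a7c-bef2-8c5f6b495d67
```

Passing the unedited placeholder string raises `ValueError` instead of silently producing a wrong ID.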

Now you can explicitly declare the project ID's environment variable:

In [None]:
#explicitly declare your project ID as an environment variable
os.environ['DATASTAGE_PROJECT_ID'] = project_id

#### 4) Verify Your Authentication Variables
Check the output of the following command to verify that the DataStage authentication variables (or the name of your credentials file) were added to your environment:

In [None]:
os.environ
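
Printing all of `os.environ` also shows your API key in full, along with every unrelated variable. As an alternative, the sketch below lists only the relevant entries and masks the key. `masked_datastage_env` is a hypothetical helper written for this lab, and the choice to keep the last four characters visible is arbitrary.

```python
import os


def masked_datastage_env(env=None, prefix="DATASTAGE_"):
    """Return DATASTAGE_* entries and IBM_CREDENTIALS_FILE, with the API key masked."""
    env = os.environ if env is None else env
    shown = {}
    for name, value in env.items():
        if name.startswith(prefix) or name == "IBM_CREDENTIALS_FILE":
            if name.endswith("APIKEY"):
                # Show the last four characters only when the key is long enough
                # that doing so does not reveal most of it.
                visible = value[-4:] if len(value) > 8 else ""
                value = "*" * (len(value) - len(visible)) + visible
            shown[name] = value
    return shown


# Example with a made-up key and one unrelated variable.
example_env = {
    "DATASTAGE_APIKEY": "abcd1234efgh5678",
    "DATASTAGE_AUTH_TYPE": "iam",
    "PATH": "/usr/bin",
}
print(masked_datastage_env(example_env))
# → {'DATASTAGE_APIKEY': '************5678', 'DATASTAGE_AUTH_TYPE': 'iam'}
```

Calling `masked_datastage_env()` with no arguments inspects your real environment.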

## Create a service instance

Pass the service name 'DATASTAGE' as an argument to DatastageV3.new_instance() so the SDK can find the environment variables that belong to your DataStage service (the ones prefixed with `DATASTAGE_`).

In [None]:
import os
from datastage.datastage_v3 import *

service_name = 'DATASTAGE'

#create a datastage instance
#this will automatically reference your credentials from the environment
datastage_service = DatastageV3.new_instance(service_name)

#check if the instance was created properly
if datastage_service is not None:
    print("DataStage Client Instance Created\n")
    print("Your DataStage service is ready for use")
else: 
    print("Error creating DataStage client - make sure to set your environment variables for authentication and provide a service name when calling a new instance")

## Get a list of your project's flows to verify the DataStage service connection
Use DataStage's *list_datastage_flows()* function to list your project's flows. Make sure to explicitly set the project_id argument to your project's ID, as done below, and choose a limit for the number of flows to return (the limit of 100 was chosen arbitrarily).

The try/except block is there so that if the connection is dropped because this notebook has been open too long, the except branch runs the same call a second time, which re-establishes the connection.

In [None]:
#json is imported explicitly because json.dumps is used at the end of this cell
import json

#attempt to list datastage flows
try:
    project_flows = datastage_service.list_datastage_flows(
        project_id=os.environ['DATASTAGE_PROJECT_ID'], limit=100).get_result()
#this will re-run if there's a connection error due to a timeout
except ConnectionError:
    project_flows = datastage_service.list_datastage_flows(
        project_id=os.environ['DATASTAGE_PROJECT_ID'], limit=100).get_result()

#print your flow's details in a nested json format
print(json.dumps(project_flows, indent=2))
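
The cell above retries exactly once. If you would rather retry a bounded number of times with a pause in between, a small wrapper along the following lines is one option. `call_with_retry` is a hypothetical helper, not part of the datastage library, and the demonstration uses a stand-in function instead of a real API call.

```python
import time


def call_with_retry(func, attempts=3, delay_seconds=1.0, retry_on=(ConnectionError,)):
    """Call func(); on a listed exception, wait and retry, up to `attempts` calls in total."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            time.sleep(delay_seconds)


# Stand-in for a flaky call: fails twice, then succeeds.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated timeout")
    return {"status": "ok"}


result = call_with_retry(flaky, attempts=3, delay_seconds=0)
print(result, calls["count"])
# → {'status': 'ok'} 3
```

With the service from this notebook, the wrapped call would be `call_with_retry(lambda: datastage_service.list_datastage_flows(project_id=os.environ['DATASTAGE_PROJECT_ID'], limit=100).get_result())`. Which exception type is raised on a dropped connection is an assumption here, so adjust `retry_on` to match what you actually observe.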

## Want to learn more?

Check out the DataStage Python API Documentation [here](https://cloud.ibm.com/apidocs/datastage?code=python)

Connect with the IBM DataStage Community [here](https://community.ibm.com/community/user/dataops/communities/community-home?CommunityKey=3bfc9f2f-4a5e-470a-9295-4b7cc90c9518)

### Thank you for completing this lab!
Be on the lookout for more posts in this tutorial series that will cover using the API to manipulate flows & sub flows, a

## Author

Alexander Polus

## Change Log

| Date (YYYY-MM-DD) | Version | Changed By | Change Description                 |
| ----------------- | ------- | ---------- | ---------------------------------- |
|                   |     1.0 |  Alexander |                    Initial Staging |
|                   |         |            |                                    |

## <h3 align="center"> © IBM Corporation 2021. All rights reserved. </h3>