# Using Python to interact with LUSID
This tutorial is designed to be a starting point for writing simple Python scripts in the Jupyter environment to interact with LUSID programmatically. The tutorial will cover:
- Using Jupyter
- Basic Python scripting
- Using Numpy to efficiently work with large, multi-dimensional data
- Using Pandas to work with dataframes
- Using the Python LUSID Software Development Kit (SDK)
- Using the Luminesce SDK
- Using the Lumipy SDK

## Basic Python scripting
This section of the tutorial serves as a brief introduction to Python and writing some Python code. We'll cover:
- What is Python
- Assigning variables and built-in types
- Python Sequence types
- Python Mapping types
- Decision statements
- Loops
- Functions
- Classes and objects

### What is Python?
Guido van Rossum began developing Python in 1989, aiming for a programming language that is:
- Easy and intuitive, yet just as powerful as its major competitors
- Open source, so anyone can contribute to its development
- As understandable as plain English
- Suitable for everyday tasks, allowing for short development times

Python is interpreted, which means there is no separate build step before running your code, so you can write and test code interactively. In Jupyter, you write your code in a code cell and execute it by clicking the run button. It's as simple as that!

### Assigning variables and built-in types
Here we'll learn how to create some variables and look at some of the built-in types that Python provides.
#### Assigning variables
Python is dynamically typed, so you don't have to declare the type of a variable before using it. By [convention](https://peps.python.org/pep-0008/#function-and-variable-names), multi-word variable names are lowercase, with words separated by underscores.

In [1]:
# This is a comment, it is ignored by the interpreter

# Assigning hello world to the variable x
# in other languages you would have to declare that x is a string before doing this
x = 'hello world'
x

'hello world'

In [3]:
# Assigning 7 to the variable y
# in other languages you would have to declare that y is an integer before doing this
y = 7
y

7

#### Built-in types
Here we'll introduce some of the built-in types that are provided out-of-the-box with Python:
- Numeric types - int, float
- The Text Sequence type - str
- Boolean Values
- The Null Object - None

##### Numeric types - int, float
In Python, integers are zero, positive or negative whole numbers with no fractional part, and they have unlimited precision.\
Floats represent floating-point numbers with the same precision as the double type in other common languages.
You can use common mathematical operations with both floats and integers.

In [7]:
# Some integers
a = 1
b = 2000
c = 0
d = -99

# Some floats
e = 0.1
f = -25.972

In [8]:
# adding ints
a + b

2001

In [9]:
# adding floats
e + f
# as f is negative, it's subtracted from e, as you would expect

-25.872

In [10]:
# some more complex math
2*(a+b)/e

40020.0

In [12]:
f = a + b
f + 1

2002
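
The difference between the two numeric types is easiest to see side by side. The cell below is a minimal sketch (the variable names `big` and `total` are illustrative, not part of the original notebook): integers grow as large as needed, while floats are binary approximations that are best compared with a tolerance.

```python
import math

# Integers have unlimited precision, so this does not overflow
big = 2 ** 100
print(big)

# Floats are binary approximations, so some decimal sums are not exact
total = 0.1 + 0.2
print(total == 0.3)              # False

# math.isclose compares two floats within a tolerance
print(math.isclose(total, 0.3))  # True

# / always returns a float, // performs floor division
print(7 / 2)                     # 3.5
print(7 // 2)                    # 3
```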

##### The Text Sequence type - str
Strings represent text data. In Python, strings are immutable sequences of Unicode characters:

In [14]:
# Assigning the string hello world to the variable x
x = 'hello world'
# Also assigning the string hello world to the variable x
x = "hello world"
# There are also multi-line strings:
y = '''So
many
lines!
'''
print(y)

So
many
lines!



Strings have many useful methods for manipulating them. Here are some examples:

In [8]:
x = 'hello world'
# Return a copy of the string with its first character capitalized and the rest lowercased.
print(x.capitalize())
# Return a copy of the string with all the cased characters converted to uppercase
print(x.upper())
# Return True if all cased characters in the string are lowercase and there is at least one cased character, False otherwise.
print(x.islower())

Hello world
HELLO WORLD
True


f-strings are also pretty useful, allowing us to interpolate values into our strings:

In [12]:
name = 'Cage'
print(f'They call him {name}')

They call him Cage
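
f-strings can also format the values they interpolate. The sketch below uses made-up values (`price` and `quantity` are illustrative only) to show a format specification after the colon and an expression evaluated inside the braces.

```python
price = 1234.5678
quantity = 3

# :,.2f adds thousands separators and rounds to two decimal places
line = f'{quantity} units at {price:,.2f} each'
print(line)      # 3 units at 1,234.57 each

# any expression can go inside the braces
summary = f'total: {price * quantity:.1f}'
print(summary)   # total: 3703.7
```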


##### Boolean Values
Python boolean values are either True or False (note the capital first letter).

In [1]:
# assigning True to a variable
booleans_are_capitalized = True
booleans_are_capitalized

True

##### The Null object - None
None is a special object that represents the absence of a value. It is also what a function (explained later) returns when it doesn't explicitly return a value.

In [5]:
x = None
print(x)

None
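
As a minimal sketch of how None shows up in practice, the hypothetical function below (`find_first_negative` is an illustrative name, not part of the original notebook) only returns a value on one path. Checking the result with `is None` is the conventional way to detect that nothing was returned.

```python
def find_first_negative(numbers):
    # returns the first negative number, if there is one
    for number in numbers:
        if number < 0:
            return number
    # no return statement is reached here, so the function returns None

found = find_first_negative([3, -1, 4])
missing = find_first_negative([3, 1, 4])

print(found)            # -1
print(missing is None)  # True
```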


### Python sequence types
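Here we'll describe some of the Python types used to store collections of data. Lists, tuples and ranges are sequence types; sets are unordered collections, covered here for comparison: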
- Lists
- Tuples
- Sets
- Range

#### Lists
Lists are the easiest way to store collections of values. Lists can hold heterogeneous values, are variable in length, and are mutable.

In [12]:
# declaring a list
sample_list = [1,2,"three", True]
sample_list

[1, 2, 'three', True]

You can access elements in most Python sequences by indexing and slicing:

In [25]:
sample_list = [1,2,"three", True]
# Get the value at index 0
print(sample_list[0])
# Get the values from index 0 up to, but not including, index 3
print(sample_list[0:3])
# Get the value at the last index
print(sample_list[-1])
# Get all values in reverse order
print(sample_list[::-1])

1
[1, 2, 'three']
True
[True, 'three', 2, 1]
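
The cells above read from a list but do not change it. A short sketch of what "mutable" and "variable in length" mean in practice, reusing the same sample values:

```python
sample_list = [1, 2, "three", True]

# replace an element in place
sample_list[2] = 3

# grow the list by adding to the end
sample_list.append("four")

# shrink the list by removing (and returning) the element at index 0
removed = sample_list.pop(0)

print(removed)           # 1
print(sample_list)       # [2, 3, True, 'four']
print(len(sample_list))  # 4
```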


#### Tuples
Tuples are similar to lists, but they are immutable. Tuples can also be unpacked into several variables in a single line.

In [13]:
x = (1,2,True)
print(x[1])

# If a tuple has only one element it must have a comma
y = (1,)
print(y)

# Cannot change an element in a tuple as it's immutable
try:
    x[1] = 7
except TypeError as exception:
    print(exception)
    
# tuple unpacking
tpl = (1,2,3)
a,b,c = tpl
print(a)
print(b)
print(c)

2
(1,)
'tuple' object does not support item assignment
1
2
3


#### Sets
Sets also store collections of values, but unlike lists they are unordered, cannot be indexed, and never contain duplicate values:

In [42]:
# Creating a set with duplicate 2 values:
x = {1,2,2,3}
print(x)

{1, 2, 3}
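
Beyond removing duplicates, sets are typically used for membership tests and for combining collections. A minimal sketch with illustrative values:

```python
a = {1, 2, 3}
b = {3, 4}

# membership test
print(2 in a)   # True

# union: values in either set
union = a | b
# intersection: values in both sets
common = a & b
print(union)
print(common)   # {3}

# a common idiom: remove duplicates from a list, then sort the result
unique_letters = sorted(set(['x', 'y', 'x']))
print(unique_letters)  # ['x', 'y']
```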


#### Range
Range is a sequence type that represents an immutable sequence of numbers. It's normally used to generate the numbers for a for loop:

In [10]:
# from 0 to 9
print(list(range(10)))
# from 9 to 0
print(list(range(9, -1, -1)))
# even numbers from 2 up to, but not including, 10
print(list(range(2, 10, 2)))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
[2, 4, 6, 8]


### Python Mapping Types
dict is Python's built-in mapping type. Dicts store key-value pairs where keys must be [hashable](https://docs.python.org/3/glossary.html#term-hashable), and values can be any type. Looking up a value by its key takes roughly constant time on average, so it is usually easier and more efficient than searching a sequence type for the same value.

In [46]:
dictionary = {'banana':True, 7:'hello_world'}
print(dictionary)

print(dictionary['banana'])
dictionary['apple'] = False
print(dictionary)

{'banana': True, 7: 'hello_world'}
True
{'banana': True, 7: 'hello_world', 'apple': False}
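
A few more dictionary operations come up constantly in scripts. The sketch below uses a made-up `prices` dictionary to show safe lookups with `get`, membership tests, deletion, and looping over key-value pairs.

```python
prices = {'apple': 0.5, 'banana': 0.25}

# get returns None (or a default you supply) instead of raising KeyError
print(prices.get('cherry'))       # None
print(prices.get('cherry', 0.0))  # 0.0

# `in` checks whether a key exists
print('apple' in prices)          # True

# remove a key-value pair
del prices['banana']

# items() yields (key, value) tuples, which can be unpacked in a for loop
for fruit, price in prices.items():
    print(f'{fruit} costs {price}')
```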


### Decision statements
Python allows you to control the flow of the program. The simplest way is to use 'if' statements.

In [1]:
# try changing the value of expression to see what this code does!
expression = True
# if expression evaluates to True then print hello world
# otherwise do nothing
if expression:
    print('hello world')

hello world


In [3]:
# try changing value to see what this code does!
value = 1
# if value equals 0 then print value is 0
# otherwise continue to the elif statement
if value == 0:
    print('value is 0')
# if the value is 1
# print value is 1
elif value == 1:
    print('value is 1')


value is 1


In [4]:
# try changing value to see what this code does!
value = 'banana'
# if value equals 0 then print value is 0
# otherwise continue to the elif statement
if value == 0:
    print('value is 0')
# if the value is 1
# print value is 1
# otherwise continue to next elif statement or else statement if no more elifs.
elif value == 1:
    print('value is 1')
else:
    print('banana')


banana


### Loops

Python has 2 main types of loop:
 - For loops.\
For loops work by iterating over a sequence of values. It's common to combine the range sequence type with a for loop, but any sequence type can be used:

In [17]:
# print the numbers 0 - 9
for i in range(10):
    print(i)
    
animals = ['cow', 'sheep', 'alligator']
for animal in animals:
    print(animal)

# tuple unpacking is commonly used in for loops:
# here enumerate returns a sequence of tuples
# where each tuple contains an index and the animal at that index in the list
for index, animal in enumerate(animals):
    print(index)
    print(animal)

habitats = ['farms', 'fields', 'swamps']
# zip can be used to loop over two lists simultaneously:
for animal, habitat in zip(animals, habitats):
    print(animal)
    print(habitat)
    

0
1
2
3
4
5
6
7
8
9
cow
sheep
alligator
0
cow
1
sheep
2
alligator
cow
farms
sheep
fields
alligator
swamps


- While loops \
While loops check an expression every iteration, and only break out of the loop when the expression evaluates to false.

In [21]:
i = 0
# print the numbers 1 - 10 then stop 
while i < 10:
    print('running')
    i += 1
    print(i)
print('stopped')

running
1
running
2
running
3
running
4
running
5
running
6
running
7
running
8
running
9
running
10
stopped


Common to both for loops and while loops are the _break_ and _continue_ keywords.
_break_ causes the program to exit the loop.
_continue_ causes the current iteration of the loop to end, and the next to begin.

In [24]:
# print 0 - 3 then exit loop
for i in range(10):
    print(i)
    if i == 3:
        break
        
# print 0 - 9 , but skip the number 3
for i in range(10):
    if i==3:
        continue
    print(i)

0
1
2
3
0
1
2
4
5
6
7
8
9


### Functions
Functions are a great way of re-using code.

In Python, a function can declare "parameters": variables that receive the values (arguments) passed in when the function is called. A function can hand a result back to the caller with the return statement, which lets you use that result later in your code.
If you don't provide a return statement, the function returns None.

In [6]:
# a fairly useless function that just prints something to the terminal
def print_flux_capacitor():
    print('flux capacitor')
    
# call the function
print_flux_capacitor()

# print the sum of 2 numbers:
def print_sum(a, b):
    print(a+b)

# prints 3
print_sum(1,2)
# pass a predefined variable to a function
x = 7
# prints 8
print_sum(1, x)

# return the sum of two numbers
def return_sum(a, b):
    return a + b
    
# store result of function in a variable
result = return_sum(1,2)
print(result)

flux capacitor
3
8
3
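
The practical difference between printing and returning shows up as soon as you try to reuse a result. The sketch below redefines the two functions from the cell above so that it runs on its own:

```python
def print_sum(a, b):
    print(a + b)

def return_sum(a, b):
    return a + b

# a returned value can be passed straight into another call
total = return_sum(return_sum(1, 2), 3)
print(total)    # 6

# print_sum displays 3, but hands back None
printed = print_sum(1, 2)
print(printed)  # None

# so its result cannot be used in further arithmetic
try:
    printed + 1
except TypeError as error:
    print(f'cannot reuse the result: {error}')
```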


### Classes and objects
Generally, in larger pieces of code, we model things as "objects", which use "attributes" to hold their state and "behaviours" (functions) to act on it.

Classes define what attributes and behaviours an object can have. For example, we might define a Person class, which says that people have heights and hair colours, and can walk. An object in this example would be my friend Shawn, who is 180cm tall, has blonde hair and will walk to the local cafe every so often.

Let's create a person class, and some people objects:

In [14]:
# defining our class
class Person:
    # the __init__ function is the constructor in Python - it's how we initialise our Person object and set initial attribute values
    # All object functions include a self parameter, which allows us to access the attributes and behaviours of our object
    def __init__(self, height_in_cm, hair_colour):
        self.height_in_cm = height_in_cm
        self.hair_colour = hair_colour
        # we'll also set a current location attribute that defaults to home
        self.location = 'home'
    # let's say our people can walk to a location
    def walk(self, location):
        self.location = location
    # We'll also add a function that tells Python how to print our object in a human-readable way
    # this is called a dunder (double underscore) function; these are outside the scope of this course.
    def __repr__(self):
        return f'Person(height_in_cm:{self.height_in_cm}, hair_colour:{self.hair_colour}, location:{self.location})'

# Let's model my friend Shawn:
shawn = Person(180, 'blonde')

# let's see what shawn looks like:
print(shawn)
# we'll send shawn to grab a coffee - he should move to the cafe:
shawn.walk('cafe')
print(shawn)

# we can also access attributes individually 
print(shawn.location)

# or create someone else:
dean = Person(195, 'brunette')
print(f'Dean is a {dean}')

Person(height_in_cm:180, hair_colour:blonde, location:home)
Person(height_in_cm:180, hair_colour:blonde, location:cafe)
cafe
Dean is a Person(height_in_cm:195, hair_colour:brunette, location:home)
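
One point worth making explicit is that every object keeps its own copy of the attributes. The sketch below uses a deliberately simplified Person class (the `name` attribute is an illustrative addition, not part of the class above) to show that moving one person does not move the other.

```python
class Person:
    def __init__(self, name):
        self.name = name
        # every new person starts at home
        self.location = 'home'

    def walk(self, location):
        self.location = location

shawn = Person('Shawn')
dean = Person('Dean')

# only shawn's state changes
shawn.walk('cafe')
print(shawn.location)  # cafe
print(dean.location)   # home

# isinstance checks whether an object was created from a given class
print(isinstance(shawn, Person))  # True
```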


## Using Numpy to efficiently work with large, multi-dimensional data
This part of the course is a short introduction to NumPy, a widely used Python module which leverages C code to efficiently process multi-dimensional arrays of data. You don't need to learn C to write efficient code: NumPy takes care of this for you.

This part of the course will cover:
 - What are numpy arrays
 - Creating numpy arrays
 - Basic vectorized operations
 - Indexing, slicing and iterating

### What are Numpy arrays?
While a Python list can contain different data types, all of the elements in a NumPy array must share a single type, which is what allows NumPy to store and process them efficiently.

An array is a grid of values and it contains information about the raw data and how to locate an element. The elements are all of the same type, referred to as the array dtype. Note, this will not be one of the built-in Python types.
 
The rank of the array is the number of dimensions. The shape of the array is a tuple of integers giving the size of the array along each dimension.
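
As a rough illustration of these terms, the sketch below (array names are illustrative) shows NumPy choosing a single dtype for mixed input, and reports the rank and shape of a three-dimensional array.

```python
import numpy as np

# mixing ints and floats: every element is stored as a float
mixed_numbers = np.array([1, 2.5, 3])
print(mixed_numbers.dtype)  # float64

# mixing numbers and text: every element is stored as a Unicode string
mixed_text = np.array([1, 'two'])
print(mixed_text.dtype.kind)  # U

# a rank-3 array: 2 blocks, each with 3 rows and 4 columns
grid = np.zeros((2, 3, 4))
print(grid.ndim)   # 3
print(grid.shape)  # (2, 3, 4)
```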

### Creating numpy arrays

Let's create some numpy arrays and explore their structure:


In [11]:
# let's print the arrays we create, along with array metadata
def describe_np(array):
    # tuple of integers giving the size of the array along each dimension
    print(f'shape: {array.shape}')
    # rank (number of dimensions)
    print(f'rank: {array.ndim}')
    # type of data
    print(f'dtype: {array.dtype.name}')
    # size in bytes of each entry in the array
    print(f'itemsize: {array.itemsize}')
    # number of elements in the array
    print(f'size: {array.size}')
    print('-'*20)
    print(f'a:{array}')
    print('-'*20)
    

In [12]:
import numpy as np

# create a numpy array with numbers from 0 to 14
# of rank 2
# with the first dimension of size 3,
# and second dimension of size 5

a = np.arange(15).reshape(3, 5)
describe_np(a)



shape: (3, 5)
rank: 2
dtype: int64
itemsize: 8
size: 15
--------------------
a:[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]]
--------------------


In [14]:
# create np array from a Python list
a = np.array([2, 3, 4])
describe_np(a)

shape: (3,)
rank: 1
dtype: int64
itemsize: 8
size: 3
--------------------
a:[2 3 4]
--------------------


In [15]:
b = np.array([1.2, 3.5, 5.1])
# notice dtype is float64
describe_np(b)

shape: (3,)
rank: 1
dtype: float64
itemsize: 8
size: 3
--------------------
a:[1.2 3.5 5.1]
--------------------


In [18]:
# numpy will automatically convert a sequence of sequences to a 2-dimensional array
# deeper nesting produces arrays with more dimensions
c = np.array([(1.5, 2, 3), (4, 5, 6)])
describe_np(c)

shape: (2, 3)
rank: 2
dtype: float64
itemsize: 8
size: 6
--------------------
a:[[1.5 2.  3. ]
 [4.  5.  6. ]]
--------------------


In [21]:
# we can define the type when we create an array
d = np.array([1, 2], dtype=np.float64)
describe_np(d)

shape: (2,)
rank: 1
dtype: float64
itemsize: 8
size: 2
--------------------
a:[1. 2.]
--------------------


In [24]:
# we can use zeros to create an array filled with zeros
# we have to pass the shape of the array to the zeros fn
e = np.zeros((2,3))
describe_np(e)

shape: (2, 3)
rank: 2
dtype: float64
itemsize: 8
size: 6
--------------------
a:[[0. 0. 0.]
 [0. 0. 0.]]
--------------------


In [25]:
# ones does the same
f = np.ones((2,3))
describe_np(f)

shape: (2, 3)
rank: 2
dtype: float64
itemsize: 8
size: 6
--------------------
a:[[1. 1. 1.]
 [1. 1. 1.]]
--------------------


In [26]:
# we can use arange to create a sequence of integers - similar to range in Python
g = np.arange(4)
describe_np(g)

shape: (4,)
rank: 1
dtype: int64
itemsize: 8
size: 4
--------------------
a:[0 1 2 3]
--------------------


In [27]:
# use linspace to do the same for floating point sequences
h = np.linspace(0, 2, 9) # 9 numbers from 0 to 2
describe_np(h)

shape: (9,)
rank: 1
dtype: float64
itemsize: 8
size: 9
--------------------
a:[0.   0.25 0.5  0.75 1.   1.25 1.5  1.75 2.  ]
--------------------


### Basic vectorized operations
Arithmetic operators on arrays apply elementwise. A new array is created and filled with the result.

In [31]:
a = np.array([20, 30, 40, 50])
b = np.arange(4) #[0, 1, 2, 3]
# returns a new array
# with values:
# [
# 20 - 0
# 30 - 1
# 40 - 2
# 50 - 3
#]
c = a - b
describe_np(c)
print('')
describe_np(b**2)
print('')
describe_np(a < 35)

shape: (4,)
rank: 1
dtype: int64
itemsize: 8
size: 4
--------------------
a:[20 29 38 47]
--------------------

shape: (4,)
rank: 1
dtype: int64
itemsize: 8
size: 4
--------------------
a:[0 1 4 9]
--------------------

shape: (4,)
rank: 1
dtype: bool
itemsize: 1
size: 4
--------------------
a:[ True  True False False]
--------------------
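
To see what "elementwise" saves you, the sketch below (with made-up values) doubles every element once with an explicit loop and once with a single vectorized expression, and confirms that the original array is left untouched.

```python
import numpy as np

values = np.array([10.0, 20.0, 30.0])

# explicit Python loop over the elements
looped = []
for value in values:
    looped.append(value * 2)

# the vectorized equivalent: the scalar 2 is applied to every element
doubled = values * 2

print(doubled)                          # [20. 40. 60.]
print(np.array_equal(doubled, looped))  # True

# a new array was created, so the original is unchanged
print(values)                           # [10. 20. 30.]
```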


Many unary operations, such as computing the sum of all the elements in the array, are implemented as methods of the ndarray class.

In [34]:
a = np.array([20, 30, 40, 50])
print(a.sum())
print(a.min())
print(a.max())

140
20
50


In [35]:
# sum along one dimension
b = np.ones((2,3))
# sum along the first dimension (axis 0), giving one total per column
describe_np(b.sum(axis = 0))

shape: (3,)
rank: 1
dtype: float64
itemsize: 8
size: 3
--------------------
a:[2. 2. 2.]
--------------------
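
Because every element of the array above is 1, it is hard to tell which direction was summed. A small sketch with distinct values (the array `m` is illustrative) makes the effect of the axis argument easier to follow:

```python
import numpy as np

# [[0 1 2]
#  [3 4 5]]
m = np.arange(6).reshape(2, 3)

# axis=0 collapses the rows: one total per column
column_totals = m.sum(axis=0)
print(column_totals)  # [3 5 7]

# axis=1 collapses the columns: one total per row
row_totals = m.sum(axis=1)
print(row_totals)     # [ 3 12]

# with no axis, the operation covers the whole array
print(m.mean())       # 2.5
```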


### Indexing, slicing and iterating
One-dimensional arrays can be indexed, sliced and iterated over, much like lists and other Python sequences.

In [37]:
a = np.arange(10)
describe_np(a[2])
print('')
describe_np(a[2:5])

shape: ()
rank: 0
dtype: int64
itemsize: 8
size: 1
--------------------
a:2
--------------------

shape: (3,)
rank: 1
dtype: int64
itemsize: 8
size: 3
--------------------
a:[2 3 4]
--------------------


Multidimensional arrays can have one index per axis. These indices are given in a tuple separated by commas:

In [42]:
b = np.array([[1,2,3],[4,5,6]])
describe_np(b)
print('')
describe_np(b[0,0])
print('')
# all values in column 1
describe_np(b[:, 1])

shape: (2, 3)
rank: 2
dtype: int64
itemsize: 8
size: 6
--------------------
a:[[1 2 3]
 [4 5 6]]
--------------------

shape: ()
rank: 0
dtype: int64
itemsize: 8
size: 1
--------------------
a:1
--------------------

shape: (2,)
rank: 1
dtype: int64
itemsize: 8
size: 2
--------------------
a:[2 5]
--------------------
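
Indices and slices can be mixed freely across axes, and a boolean array (like the result of `a < 35` earlier) can be used as an index too. A minimal sketch using the same array:

```python
import numpy as np

b = np.array([[1, 2, 3], [4, 5, 6]])

# negative indices count from the end: the last row
last_row = b[-1]
print(last_row)       # [4 5 6]

# all rows, columns from index 1 onwards
right_columns = b[:, 1:]
print(right_columns)
# [[2 3]
#  [5 6]]

# a boolean mask keeps only the elements where the condition is True
greater_than_three = b[b > 3]
print(greater_than_three)  # [4 5 6]
```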


Iterating over multidimensional arrays is done with respect to the first axis:



In [44]:
b = np.array([[1,2,3],[4,5,6]])
for index, row in enumerate(b):
    print(f'row {index}: {row}')

row 0: [1 2 3]
row 1: [4 5 6]


Alternatively, we can use the flat attribute to iterate over every element in an array:

In [45]:
b = np.array([[1,2,3],[4,5,6]])
for index, element in enumerate(b.flat):
    print(f'element {index}: {element}')

element 0: 1
element 1: 2
element 2: 3
element 3: 4
element 4: 5
element 5: 6


## Using Pandas to work with dataframes

We'll explore the widely used pandas library in this part of the course. We'll cover:
- Intro to pandas
- Structure of a dataframe
- Creating dataframes
- Viewing dataframes
- Slicing dataframes

### Intro to pandas
Pandas aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis / manipulation tool available in any language.

Most of the functionality of pandas is delivered through two data structures: the DataFrame and the Series.

### Structure of a dataframe
A DataFrame is a 2-dimensional data structure that can store data of different types (including characters, integers, floating point values, categorical data and more) in columns.

In [3]:
import pandas as pd
df = pd.DataFrame([{'name':'Nickols', 'height in cm':123, 'hair colour': 'red'},
                   {'name':'Benjals', 'height in cm':200, 'hair colour': 'black'},
                   {'name':'Dennisons', 'height in cm':180, 'hair colour': 'brunette'}])
df.head()

|   | name | height in cm | hair colour |
|---|------|--------------|-------------|
| 0 | Nickols | 123 | red |
| 1 | Benjals | 200 | black |
| 2 | Dennisons | 180 | brunette |
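
To connect the DataFrame to the Series mentioned earlier, the sketch below rebuilds the same DataFrame and shows that selecting a single column gives a Series, and that a boolean condition on a column can be used to filter rows. It is only a brief illustration, not a full tour of the pandas API.

```python
import pandas as pd

df = pd.DataFrame([{'name': 'Nickols', 'height in cm': 123, 'hair colour': 'red'},
                   {'name': 'Benjals', 'height in cm': 200, 'hair colour': 'black'},
                   {'name': 'Dennisons', 'height in cm': 180, 'hair colour': 'brunette'}])

# (number of rows, number of columns)
print(df.shape)                # (3, 3)

# selecting one column returns a Series
heights = df['height in cm']
print(type(heights).__name__)  # Series
print(heights.max())           # 200

# a boolean condition on a column filters the rows
tall = df[df['height in cm'] > 150]
print(tall['name'].tolist())   # ['Benjals', 'Dennisons']
```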


## Using the Python LUSID Software Development Kit (SDK)

## Using the Lumipy SDK