<center><img src="https://www.research.va.gov/naii/SPCommunityBanner700x350.png"/></center>
<br>
<br>

# AI @ VA Python for Beginners
#### A Gentle Introduction to Interactive Python (IPython) Notebooks and the Python Programming Language
<p>By Tim Strebel</p>



### Section 1: Introduction

Hello and welcome to my beginner's tutorial on IPython notebooks. My name is Tim Strebel. I am a <strong>data scientist</strong> and Artificial Intelligence (AI) Program Supervisor at the Washington DC VA. I want to extend a warm welcome to everyone to the <a href=https://www.research.va.gov/naii/join.cfm>AI @ VA community</a>.


To kick off our AI @ VA community lunch and learn series, I wanted to create a Python tutorial that is useful and accessible to people of all skill levels.

In this tutorial, we will explore some basic concepts of **Artificial Intelligence**, **interactive Python (IPython) notebooks** and the **Python programming language**.


To start, **Artificial Intelligence (AI)** is a computer science discipline that aims to leverage machines to perform tasks commonly associated with intelligent beings <a href=https://www.britannica.com/technology/artificial-intelligence>source</a>. At the foundation of AI technology is **machine learning**, which is the use of algorithms and statistics to find patterns in large datasets.

When datasets become quite large, the term **"big data"** is commonly used to describe them. Datasets of this size can be used for **deep learning**. You can think of deep learning as machine learning on steroids. It uses techniques such as **deep neural networks**, which are computationally intensive models that can capture intricate and complex patterns in large datasets.

<center>
<figure>
<img src="https://miro.medium.com/max/1400/0*1DqTRU7WREONm9oa.png"/>
<figcaption>Neural Network Architecture - Source: <a href="https://playground.tensorflow.org/#activation=tanh&batchSize=10&dataset=circle&regDataset=reg-plane&learningRate=0.03&regularizationRate=0&noise=0&networkShape=4,2&seed=0.51581&showTestData=false&discretize=false&percTrainData=50&x=true&y=true&xTimesY=false&xSquared=false&ySquared=false&cosX=false&sinX=false&cosY=false&sinY=false&collectStats=false&problem=classification&initZero=false&hideText=false">https://playground.tensorflow.org</a></figcaption>
</figure>
</center>

There are many programming languages that are commonly used to develop AI models and systems. 

While the best language for AI is still a matter of great debate, there are many reasons to prefer Python above all others.

As an avid Python programmer and advocate for the language, I make it my go-to recommendation for anyone wanting to practice machine and deep learning, for many reasons. Python:
<ul>
    <li>Is easy to learn and use</li>
    <li>Is open source</li>
    <li>Has an active community that provides robust support through sites like Stack Overflow and GitHub</li>
    <li>Has well-maintained documentation for its core and other libraries</li>
    <li>Has fully-matured libraries for natural language processing (NLP), mathematics, data mining, data manipulation and data visualization, which are foundational AI techniques
    <ul>
        <li>Packages include <strong>Spacy, NLTK, Numpy, Scipy, Weka, Mlxtend, Pandas, Matplotlib, Plotly</strong> and more.</li>
    </ul>
    </li>
    <li>Has state-of-the-art libraries for machine and deep learning
    <ul>
        <li>Packages include <strong>Scikit-learn, Theano, Keras, Tensorflow and PyTorch.</strong></li>
    </ul>
    </li>
</ul>
<br>
<center>
    <table>
        <tr>
            <td><img src="https://steemitimages.com/640x0/https://cdn.steemitimages.com/DQmTgS1uQwEPqnobt4LEctbZP2dsqPnPC3jEw2EgyeLqz1H/image.png"/></td>
            <td><img src="https://steemitimages.com/640x0/https://cdn.steemitimages.com/DQmSymJNoerFg7KJoxjaa7vJoK1iKC57PbS4XnTkXGEgoqD/image.png"/></td>
        </tr>
        <tr>
            <td><p>Source: <i><a href="https://steemit.com/technology/@aitech/best-programming-languages-for-artificial-intelligence"> https://steemit.com/</a></i></p></td>
        </tr>
    </table>
</center>

Python can be downloaded from <a href=https://www.python.org/>https://www.python.org/</a>. 

Once Python is installed, you can create Python files in an <strong>integrated development environment (IDE)</strong> or in a plain text editor such as <strong>Notepad</strong>.

Python files are saved by appending the <i>.py</i> file extension and can be executed from the command line or by clicking on the file in Windows or Linux assuming you have properly set the environment path variable (<a href=https://docs.python.org/3/using/windows.html>see section 4.6 of this link</a>). 

You can also schedule .py files to run on your computer via the Windows Task Scheduler. This is handy for automating daily tasks on your computer such as populating an Excel spreadsheet or updating information on a website.

*For security purposes, you should never execute .py files on your computer that are from unknown sources.*

Python on its own can be a very powerful tool, but when combined with **interactive Python notebooks** (AKA **IPython notebooks**), the effects are synergistic. 

In interactive notebooks you can:
* enter chunks of **Python code** into cells that can be executed in sequence
* visually inspect the output(s) of your code
* use **markup** languages to create a **computational narrative** (like I am currently doing)
* visualize the results of data analysis by using **data visualization** libraries
* **collaborate** with other data scientists and machine learning engineers on AI projects

Interactive notebooks are saved by appending the *.ipynb* file extension. While there are many different IDE flavors to **save/load/edit** interactive Python notebooks, the .ipynb file at its core is nothing more than a *.json* file, which is a flexible, widely recognized storage format.
 
Some of my favorite free IDEs that support IPython notebooks are <a href=https://jupyter.org/>Jupyter</a>, <a href=https://code.visualstudio.com/>Visual Studio Code</a> and <a href=https://deepnote.com/>Deepnote</a>.

### Section 2: Markdown Cells

There are two types of cells in an IPython notebook, <strong>code</strong> and <strong>markdown</strong>.
<ul>
    <li>Code cells allow you to edit Python code and display outputs.</li>
    <li>Markdown cells allow you to embed markup tags to display text, images, headers and even mathematical notation.</li>
    <li>There are at least four types of markup/markdown syntaxes that IPython notebooks support.
    <ol>
        <li><a href=https://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/Working%20With%20Markdown%20Cells.html#Embedded-code>Jupyter Markdown</a></li>
        <li><a href=https://latex-tutorial.com/tutorials/amsmath/#:~:text=LaTeX%20math%20and%20equations%201%20Using%20inline%20math,in%20math%20mode%20%E2%80%93%20Scaling.%20...%206%20Summary>LaTeX</a> equations</li>
        <li><a href=https://github.github.com/gfm/>Github Flavored Markdown (GFM)</a></li>
        <li><a href=https://www.w3schools.com/html/>Hyper-text Markup Language (HTML)</a> (support for some tags may be limited)</li>
    </ol>
    </li>
</ul>

Go ahead and <strong>double-click</strong> on the cell to unrender the markdown and view the different IPython syntaxes. When you are done, use the <strong>ctrl+Enter</strong> hotkey to re-render the cell.

#### Jupyter Markdown Examples

**Headings**

# Heading 1
## Heading 2
### Heading 3
#### Heading 4

**Bold Text**

**this is bold text** 

**Italicize Text**  

*This is italicized text*

**Escape Characters**

\*literal asterisks\*
 *literal asterisks*

**Ordered List**  

1. One
   1. Sublist level one
   1. Sublist level one
       1. Sublist level two
       1. Sublist level two
1. Two

**Unordered List**  

* One
  * Sublist level one
    * Sublist level two
* Two

**Text Line Breaks** (use two spaces after each line of text to start a new line)  

Two households, both alike in dignity,  
In fair Verona, where we lay our scene,  
From ancient grudge break to new mutiny,  
Where civil blood makes civil hands unclean.

**Image Embedding**  

![image of a cat sitting on a chair](https://media.istockphoto.com/photos/all-paws-down-im-the-cutest-picture-id859654404?b=1&k=20&m=859654404&s=170667a&w=0&h=jXyfpUF1n_Ad-ES_fB6XLx_flRC-VUIUtRTTLe_DKTE=)

### LaTeX Equation Examples  

**Manhattan Distance**  
$$ d(\vec{X}, \vec{Y}) = \sum_i | x_i - y_i| $$

**Euclidean Distance**  
$$ d(\vec{X}, \vec{Y}) = \sqrt{\sum_i(x_i-y_i)^2} $$

**Cosine Similarity**  
$$ \cos(\vec{X}, \vec{Y}) = \frac{\sum_ix_iy_i}{\sqrt{\sum_ix_i^2}\cdot\sqrt{\sum_iy_i^2}} $$

### Github Flavored Markdown (GFM)

**Quoting Text**  

Text that is not a quote

> Text that is a quote

**Quoting Code**  

Some basic Git commands are:
```
git status
git add
git commit
```

**Footnotes**  

Here is a simple footnote[^1].

A footnote can also have multiple lines[^2].  

You can also use words, to fit your writing style more closely[^note].

[^1]: My reference.
[^2]: Every new line should be prefixed with 2 spaces.  
  This allows you to have a footnote with multiple lines.
[^note]:
    Named footnotes will still render with numbers instead of the text but allow easier identification and linking.  
    This footnote also has been made with a different syntax using 4 spaces for new lines.

**A GFM Table**  

| First Header  | Second Header |
| ------------- | ------------- |
| Content Cell  | Content Cell  |
| Content Cell  | Content Cell  |

### Hyper-text Markup Language (HTML)

<h1>Heading 1</h1>
<h2>Heading 2</h2>
<h3>Heading 3</h3>
<h4>Heading 4</h4>

**Bold Text**

<strong>this is bold text</strong>

**Italicize Text**  

<i>This is italicized text</i>

**Ordered List**  

<ol>
  <li>One
   <ol>
      <li>Sublist level one
      <ol>
        <li>Sublist level two
      </ol></li>
  </ol></li>
  <li>Two</li>
</ol>

**Unordered List**  

<ul>
  <li>One
   <ul>
      <li>Sublist level one
      <ul>
        <li>Sublist level two
      </ul></li>
  </ul></li>
  <li>Two</li>
</ul>

**HTML Table**

<table>
  <tr>
    <th><strong>First Header</strong></th>
    <th><strong>Second Header</strong></th>
  </tr>
  <tr>
    <td>Content Cell</td>
    <td>Content Cell</td>
  </tr>
  <tr>
    <td>Content Cell</td>
    <td>Content Cell</td>
  </tr>
</table>

**Image Embedding**  

<img height=150 src="https://media.istockphoto.com/photos/all-paws-down-im-the-cutest-picture-id859654404?b=1&k=20&m=859654404&s=170667a&w=0&h=jXyfpUF1n_Ad-ES_fB6XLx_flRC-VUIUtRTTLe_DKTE="/>

**Centered Image**  

<center><img height=150 src="https://media.istockphoto.com/photos/all-paws-down-im-the-cutest-picture-id859654404?b=1&k=20&m=859654404&s=170667a&w=0&h=jXyfpUF1n_Ad-ES_fB6XLx_flRC-VUIUtRTTLe_DKTE="/></center>

### Section 3: Python Tutorial

In IPython **code cells**, you can easily write, edit and execute Python code within an IPython notebook. This capability is great for project collaboration and for unit-testing chunks of code. To execute a code cell in an IPython notebook, simply click on the cell and use the **ctrl+Enter** hotkey. When executing cells, you can display the output of code using the Python native ```print()``` function. Go ahead and execute the following code cell to print ```"Hello, World!"```.

\**An in-depth overview of functions is outside of the scope of this notebook but perhaps we can explore them in another lunch and learn session.*

In [None]:
print('Hello, World!')

When writing code cells, it's also a good practice to use comments to guide collaborators through your code. Comments can be created by prefixing lines in code cells with a **#** (AKA pound or hash). You can also quickly **comment/uncomment** lines of code within a cell by using the **ctrl+/** hotkey.

In [None]:
# The following line of code is commented out and will not execute
# print('I will not execute unless you uncomment me!')

# The following line of code will execute
print('I am not commented and will execute!')

#### Python Variables
In Python, ***variables*** are used to store **data structures, types,** and **functions**. 

Variables can be declared using the ***assignment*** ```=``` operator. You create a variable by writing a name, the ```=``` operator, and the value you want that name to hold.

Once assigned, you can **print** the contents of a **variable** by placing it within the parentheses of the ```print()``` function.

In [None]:
x = 'Hello, World!'

print(x)

Variables assigned in one code cell can be referenced in other code cells throughout the IPython notebook. Also, in IPython code cells, you don't need to use the ```print()``` function to display the contents of a variable. 

You can save time by just placing the variable at the bottom of the cell and then executing it. 

One caveat is that the displayed output may differ from what ```print()``` shows, depending on the type of the value. Execute the following line of code to display the contents of the variable ```x``` defined in the previous cell.

In [None]:
x
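
The caveat above comes down to two different text forms of the same value. As a rough sketch (the names `printed_form` and `echoed_form` are made up for this illustration), `print()` shows the `str()` form of a value, while a bare variable at the bottom of a cell is typically displayed in its `repr()` form:

```python
x = 'Hello, World!'

# print() displays the human-readable str() form of the value
printed_form = str(x)
# A bare variable at the end of a cell is typically shown in its repr() form
echoed_form = repr(x)

print(printed_form)
print(echoed_form)
```

For a string, the second form keeps the surrounding quotation marks, which is why the cell above shows quotes that ```print(x)``` does not.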

Variables can also be deleted using the Python ```del``` statement, written below as ```del(x)```. The following code will cause an error (a ```NameError```) because we try to print a variable that has been deleted. 

In [None]:
x = 'Hello, World!'

del(x)

print(x)
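
If you would rather see the error handled than have the cell stop, here is a minimal sketch using ```try``` and ```except```. Error handling is not covered in this tutorial, so treat this as a preview; the variable name `message` is only for illustration.

```python
x = 'Hello, World!'

# Remove the name x; the parentheses used earlier are optional
del x

try:
    print(x)
except NameError as error:
    # Python raises NameError when a name no longer exists
    message = str(error)
    print('Caught an error:', message)
```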

**Variable names** have a set of **constraints** in Python. They can be of any length and can consist of upper/lowercase characters **(A-Z, a-z)**, digits **(0-9)**, and the underscore ```_``` character. An additional constraint is that the first character of a variable name cannot be a digit. The last line of the following cell breaks that rule on purpose and will cause a ```SyntaxError```.

In [None]:
_this_is_a_valid_variable_name = "Hello, World!"
_99_this_is_also_valid_variable_name_77 = "Hello, World!"
9this_is_an_invalid_variable_name = "Hello, World"
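
You can also check a candidate name without triggering an error. The sketch below uses the built-in string method `isidentifier()` and the standard library `keyword` module; the list name `candidate_names` is just for this example. Note that `isidentifier()` only checks the character rules, so reserved words such as `for` pass it even though they cannot be used as variable names.

```python
import keyword

candidate_names = [
    '_this_is_a_valid_variable_name',
    '_99_this_is_also_valid_variable_name_77',
    '9this_is_an_invalid_variable_name',
]

for name in candidate_names:
    # isidentifier() reports whether the text follows the naming rules
    print(name, '->', name.isidentifier())

# Reserved words follow the character rules but are still off limits
print('for', '->', 'for'.isidentifier(), keyword.iskeyword('for'))
```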

Aside from variable name constraints, there are **idiomatic** or **Pythonic** naming conventions that I highly recommend learning. 

If you're interested in learning more about idiomatic Python code, I suggest checking out <a href=https://zen-of-python.info/><i>The Zen of Python</i></a>. 

Idiomatic coding is a **best practice** in computer science. It helps ensure that your code is readable and understandable by a broader audience of coders, making collaboration easier.

Two **idiomatic** guidelines for naming conventions are:
* **Variable names** should be lowercase with words separated by underscores, e.g. these_words_are_separated_by_underscores.
* **Function names** should follow the same rule as variable names.
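
As a small illustration of both guidelines, here is a sketch with made-up names (`patient_count`, `average_wait_minutes` and `format_greeting` are hypothetical and not used elsewhere in this notebook):

```python
# Variable names: lowercase words separated by underscores
patient_count = 3
average_wait_minutes = 12.5


# Function names follow the same convention
def format_greeting(first_name):
    return f'Hello, {first_name}!'


print(format_greeting('World'))
print(patient_count, average_wait_minutes)
```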


### Python Primitive Data Types

This next section will cover **primitive data types** (AKA primitive type). 

A full understanding of **types** is beyond the scope of this tutorial. The key takeaway is that Python has a set of primitive data types that are foundational to the programming language and do not require the user to download additional code **libraries** to access them. The four primitive types are:

* Strings
* Integers
* Floats
* Booleans

#### Strings

Strings in python can be declared by surrounding a **sequence of characters** with a pair of **single ('')** or **double ("")** quotes. For example, ```'Hello, World!'``` is a string and is the same as ```"Hello, World!"```.

In [None]:
# The following lines of code are the same
print('Hello, World!')
print("Hello, World!")
# You can also print single and double quotation marks within a string
# as long as you do not use the same quotation marks, i.e. double quotes
# within double, single within single, etc.
print('Double Quotes', '"Hello, World!"')
print('Single Quotes', "'Hello, World!'")

Two or more strings can be concatenated together by using the ```+``` operator.

In [None]:
hello_world = 'Hello' + ',' + ' World' + '!'

print(hello_world)
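
One thing to watch for is that ```+``` only concatenates strings with other strings. A minimal sketch of what happens with an integer, and one way around it using the built-in `str()` function (the names `count` and `message` are only for illustration):

```python
count = 3

try:
    # Adding a string and an integer raises a TypeError
    message = 'Cells executed: ' + count
except TypeError:
    # Converting the integer to a string first makes the concatenation work
    message = 'Cells executed: ' + str(count)

print(message)
```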

**String interpolation** is the process of injecting a value into a placeholder within a string. These placeholders are interchangeable and can be made dynamic by using **variables**. Python provides a few different methods for string interpolation.
* The **modulo** ```%``` operator
* The ```str.format()``` method using braces ```{}``` as placeholders
* ```f'{}'``` **f-string** formatting using braces ```{}``` as placeholders

In [None]:
# Modulo method example
print('%s, %s!'%('Hello', 'World'))

# string.format() method examples
print('{}, {}!'.format('Hello', 'World'))
print('{0}, {1}!'.format('Hello', 'World'))
print('{1}, {0}!'.format('Hello', 'World'))

# f-string method example
hello = 'Hello'
world = 'World'
print(f'{hello}, {world}!')



**Special characters** in Python strings are sequences of characters that, when printed, produce output that is different from how they are written in the string. Most special characters begin with a **backslash** ```\```. The most basic example of a special character is the new-line character ```\n```, which starts a new line within a string. Here are some examples.

In [None]:
# New line character
print('Hello,\nWorld!')
# Tab character
print('Hello,\tWorld!')
# Emojis
print('\U0001f600 \U0001f604 \U0001F606 \U0001F923')

The **escape character** for Python is the backslash ```\```. Placing a backslash in front of a special character sequence escapes it, so the sequence is printed as written. A **raw string** is a string prefixed with an ```r```. Raw strings treat backslashes as ordinary characters, so special character sequences are not interpreted. See below.

In [None]:
# This prints a single backslash
print('\\')
# This will not print a new-line character between hello and world
print('Hello,\\nWorld!')
#This is how you use a single quote within a single-quote string.
print('\'')
# Here is a raw string (note the r prefix). What once printed emojis
# now prints their character sequences.
print(r'\U0001f600 \U0001f604 \U0001F606 \U0001F923')
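
One way to see what the ```r``` prefix changes is to count characters. In this sketch (the variable names `regular` and `raw` are only for illustration), the new-line sequence counts as one character in a regular string and as two characters, a backslash and an ```n```, in a raw string:

```python
regular = 'Hello,\nWorld!'
raw = r'Hello,\nWorld!'

# In the regular string, \n is a single new-line character
print(len(regular))
# In the raw string, the backslash and the n are two separate characters
print(len(raw))
# The raw string equals the regular string written with an escaped backslash
print(raw == 'Hello,\\nWorld!')
```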

We'll end our tutorial on strings here. This tutorial is not exhaustive. Some topics of further exploration are: **string methods**, **unicode strings**, **string slicing** and **Regular Expressions (RegEx)**.

#### Integers
**Integers** in Python are used to represent whole numbers. We can use integers to represent vectors and carry out mathematical operations. We carry out mathematical operations with the use of **operators**. 

Operators are a topic of deeper discussion that is outside of the scope of this tutorial. 

Let's look at some common mathematical operators in Python that we can use to carry out mathematical operations with integers.
* The **addition operator** ```+``` is used to find the **sum** of two integers.
* The **subtraction operator** ```-``` is used to find the **difference** between two integers.
* The **multiplication operator** ```*``` is used to find the **product** of two integers.
* The **exponential operator** ```**``` is used to raise a number to the power of an **exponent**.
* The **division operator** ```/``` is used to find the **quotient** of two integers (the result is always a float).
* The **modulo operator** ```%``` is used to find the **remainder** after dividing one integer by another.
* The **floor division operator** ```//``` is used to find the **quotient** of two integers rounded down to the nearest integer.

In [None]:
# Addition
print('2 + 2 ==', 2 + 2)
# Subtraction
print('2 - 2 ==', 2 - 2)
# Multiplication
print('2 * 2 ==', 2 * 2)
# Exponent
print('2 ** 4 ==', 2 ** 4)
# Division
print('2 / 2 ==', 2 / 2)
# Modulo
print('5 % 2 ==', 5 % 2)
# Floor division
print('15 // 2 ==', 15 // 2)
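
The three division-related operators fit together: multiplying the floor quotient by the divisor and adding the remainder gives back the original number. A short sketch of that relationship, using made-up variable names and the built-in `divmod()` function, which returns both results at once:

```python
a, b = 15, 2

true_quotient = a / b    # regular division always returns a float
floor_quotient = a // b  # quotient rounded down to a whole number
remainder = a % b        # what is left over after floor division

print(true_quotient, floor_quotient, remainder)
# The floor quotient and the remainder rebuild the original number
print(floor_quotient * b + remainder == a)
# divmod() returns the floor quotient and the remainder together
print(divmod(a, b))
# "Rounded down" means toward negative infinity, not toward zero
negative_floor = -15 // 2
print(negative_floor)
```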

#### Floats

**Floats**, also known as floating-point numbers, are used to represent numbers with decimal points. All of the same mathematical operations that can be performed with integers can also be performed on floats.

In [None]:
# Addition
print('2.2 + 2.2 ==', 2.2 + 2.2)
# Subtraction
print('2.2 - 2 ==', 2.2 - 2)
# Multiplication
print('2 * 2.2 ==', 2 * 2.2)
# Exponent
print('2.2 ** 4 ==', 2.2 ** 4)
# Division
print('2.2 / 2 ==', 2.2 / 2)
# Modulo
print('5 % 2.2 ==', 5 % 2.2)
# Floor division
print('2.44 // 2.2 ==', 2.44 // 2.2)

Scientific notation in python can also be used to represent floating point numbers.

In [None]:
print(1e-3, .001)
print(1e-8, .00000001)

#### Booleans
The **boolean** type in Python and other programming languages is the most basic. Booleans are used to represent a truth value (either ```True``` or ```False```). In Python, the boolean values always start with a capital letter. 

I assume the creator of Python did this so that their lowercase counterparts can be used as variable names. That is a speculative statement and I could be wrong about that.

In [None]:
true = True
false = False
print(true, false)

Declaring boolean values outright in code is not all that interesting. It's when we start to use **comparison operators**, **logical operators** and **ternary operators** that things get interesting.

#### Comparison Operators

In [None]:
# Greater than
print('5 > 3 ==', 5 > 3)
# Less Than
print('5 < 3 ==', 5 < 3)
# Greater than or equal to
print('5 >= 5 ==', 5 >= 5)
# Less than or equal to
print('5 <= 5 ==', 5 <= 5)
# equality
print('5 == 5 ==', 5 == 5)
# Multiple comparisons
print('1 < 2 < 3 < 4 ==', 1 < 2 < 3 < 4)
print('1 < 2 < 3 > 4 ==', 1 < 2 < 3 > 4)

These comparisons evaluate intuitively when used on **integers**; however, you have to be careful when using **comparison operators** on **floats**. 

The cell below outlines the problem. We might expect the statement ```x == y``` to evaluate to ```True```. After we execute the cell, we'll see that it does not.

In [None]:
x = .1 + .1 + .1 
y = .3
print('x == y', x == y)


This is because floats are **approximate representations of real numbers**. 

When we apply mathematical operations to floats in Python, we occasionally lose precision. The cell below gives us greater insight into the problem. We can see by observing the printed output of the variable ```x``` that ```.1 + .1 + .1``` does not exactly equal ```.3```. 

In [None]:
print('x == ', x)
print('y == ', y)
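
A common way to work around this is to ask whether two floats are close enough rather than exactly equal. Here is a minimal sketch using `math.isclose()` from the standard library with its default tolerance; the names `exactly_equal` and `close_enough` are only for illustration.

```python
import math

x = .1 + .1 + .1
y = .3

# Exact equality fails because of the tiny rounding difference
exactly_equal = x == y
# math.isclose() treats values within a small relative tolerance as equal
close_enough = math.isclose(x, y)

print('x == y', exactly_equal)
print('math.isclose(x, y)', close_enough)
```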

#### Logical Operators
**Logical operators** allow us to evaluate the combined truth value of multiple statements. 

The three logical operators in python are ```and```, ```or``` and ```not```. You can combine logical operators with comparison operators to create complex algorithmic logic.

In [None]:
print('True and True ==', True and True)
print('True and False ==', True and False)
print('True or False ==', True or False)
print('False or False ==', False or False)
print('False or not False ==', False or not False)
print('False or not True ==', False or not True)
print('1 == 1 and 2 == 2 ==', 1 == 1 and 2 == 2)
print('1 == 1 and 2 > 2 ==', 1 == 1 and 2 > 2)
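
When several logical operators appear in one statement, the order in which Python groups them matters: comparisons are evaluated first, then ```not```, then ```and```, then ```or```. The sketch below (variable names are made up for this example) shows how parentheses can change the result:

```python
# "and" is grouped before "or", so this reads as True or (False and False)
without_parentheses = True or False and False
# Parentheses force the "or" to be evaluated first
with_parentheses = (True or False) and False
# Comparisons are evaluated before "not", so this reads as not (1 == 2)
negated_comparison = not 1 == 2

print(without_parentheses, with_parentheses, negated_comparison)
```

When in doubt, adding parentheses makes the intended grouping explicit for anyone reading the code.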

#### Ternary Operator
The ternary operator uses ```if``` ... ```else``` syntax to choose between two values based on whether a condition evaluates to ```True``` or ```False```. It is often used to dynamically assign a variable.

In [None]:
print('1 if 1 == 1 else 2 ==', 1 if 1 == 1 else 2)
print('1 if 2 == 1 else 2 ==', 1 if 2 == 1 else 2)

what_color_is_the_sky = 'blue' if 'sky'.startswith('s') else 'grey'

print('The color of the sky is', what_color_is_the_sky)
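
The ternary operator is a one-line shorthand for a longer ```if``` / ```else``` statement. ```if``` statements are not covered in this tutorial, so the following is only a sketch of the equivalence, using a made-up `temperature` value:

```python
temperature = 72

# Ternary form: value_if_true if condition else value_if_false
ternary_label = 'warm' if temperature > 65 else 'cool'

# Equivalent if/else statement
if temperature > 65:
    statement_label = 'warm'
else:
    statement_label = 'cool'

print(ternary_label, statement_label)
```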

#### Challenge
What will the following code cell output?

In [None]:
for i in [11, 12, 14, 17]:
    # "result" is used rather than "eval", which is the name of a built-in function
    result = 0 if i % 2 == 0 else i
    print(result)

#### Type checking
Python has built-in functions to **check variable types**. You can use the ```type()``` function to obtain the type of a value, and you can pass a variable and a type into the ```isinstance()``` function to obtain a boolean value indicating whether the variable is of that type.

In [None]:
print(type('Hello, World!'))
print(type(1))
print(type(.1))
print(type(2 + 2 == 4))

In [None]:
print("isinstance('Hello, World!', str) ==", isinstance('Hello, World!', str))
print("isinstance(1, float) ==", isinstance(1, float))
print("isinstance(.1, int) ==", isinstance(.1, int))
print("isinstance(2 + 2 == 4, bool) ==", isinstance(2 + 2 == 4, bool))
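
Two further behaviors of `isinstance()` are worth knowing. It accepts a tuple of types, which is handy for asking "is this any kind of number?", and it reports booleans as integers because `bool` is a subclass of `int` in Python. A short sketch (the name `number_types` is only for illustration):

```python
number_types = (int, float)

# A tuple of types matches if the value is an instance of any of them
print(isinstance(1, number_types))
print(isinstance(.1, number_types))
print(isinstance('1', number_types))

# bool is a subclass of int, so True also counts as an integer
print(isinstance(True, int))
```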

### Section 4: Conclusion
Thank you for attending our AI @ VA Community lunch and learn session. I hope you found this tutorial to be **fun** and **informative**. We're going to be taking your feedback and hope to create future lunch and learn sessions that are tailored toward your **desired learning objectives** and **skill level**.

My name is **Tim Strebel**. I can be reached at <a href = "mailto:Timothy.Strebel@va.gov">Timothy.Strebel@va.gov</a>, or feel free to reach out to other NAII members in the community at <a href = "mailto:VHAWAS.AI.COMMUNITY@va.gov">VHAWas.AI.Community@va.gov</a>.

In [5]:
%reload_ext watermark

%watermark

Last updated: 2021-12-15T10:23:21.383033-05:00

Python implementation: CPython
Python version       : 3.9.5
IPython version      : 7.29.0

Compiler    : MSC v.1928 64 bit (AMD64)
OS          : Windows
Release     : 10
Machine     : AMD64
Processor   : Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
CPU cores   : 4
Architecture: 64bit

