# Lecture 1

(Summer 2023)

## Outline of topics for this segment:

1. Brief history of Python
2. Python's place in the family of programming languages
3. Basic variables (integer, float, string)
4. Conversions between the basic variable types
5. Boolean variables
6. "Collection" data types
    * lists
    * tuples
    * sets
    * dictionaries
7. Operators
8. Control statements
9. Exercise: Set up GitHub ...
10. Functions
11. Exercise: Implement the Babylonian Algorithm
12. Read the tutorial material on VS Code installation

## Some useful background material:

The <a href="https://the-examples-book.com/book/introduction" target="_blank">Purdue Data Mine Examples Book</a> contains many useful chapters on data science. While they have not been directly designed for this class, they may be useful.


## 1. History of Python, Etc.

- Conceived by Guido van Rossum in December 1989 at Centrum Wiskunde & Informatica (CWI, the Dutch national research institute for mathematics and computer science).
- Python version 1.0 in January 1994.
- The license was made GPL-compatible (open source) in version 1.6.1. Python itself is not licensed under the GNU General Public License.
- Python version 2.0 in October 2000.
- Python Software Foundation formed in 2001, along with a new open source license.
- Python version 2.7 was the last release in the version 2 series. Support ended January 2020.
- Python version 3.0 released December 2008. It broke backward compatibility with much of the version 2 code.
- Version 3.10.4 was released in March 2022; newer 3.x versions have been released since.

In [None]:
# Check the version of python that we are running here ...

!python -V

## 2. Python's place in the family of programming languages

### According to the Stack Overflow survey of professional software developers ...

<img align="left" src='Figs/DeveloperSurvey2021.png' width="500"/>    

### As concerns languages for data science ...

The contenders are Python and R. For **Python**

* Most popular among data scientists.
* Very useful in machine learning and artificial intelligence because of the availability of popular libraries such as scikit-learn, matplotlib, and tensorflow.

For **R**

* A scripting language.
* Very good support for statistical computation and visualization.

This course is focused on Python instead of R because it is a better fit for the work of my research group. For an independent comparison of the two: <a href="https://www.ibm.com/cloud/blog/python-vs-r" target="_blank">Python vs. R: What's the Difference?</a>


## 3. Basic Variables

Python uses **dynamic typing**: a variable is created when a value is first assigned to it, and the type belongs to the value rather than to the name, so the same name can later be assigned a value of a different type. There are a few rules on variable names:

* Must start with a letter or underscore
* Names are case-sensitive

A Python variable is more than just its value. It must also carry information about the type of the value, and there is overhead associated with such flexibility. The code below illustrates three of the variable types: **integer, float, and string**.
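
As a minimal sketch of dynamic typing and of the naming rules above, the same name can be rebound to values of different types, and names that differ only in case are separate variables. The names `value` and `Value` are illustrative only and do not appear elsewhere in these notes.

```python
# Illustrative names only; any valid variable name would do.
value = 4                 # 'value' now refers to an int
first_type = type(value)

value = "four"            # the same name now refers to a str
second_type = type(value)

Value = 4.0               # a different variable: names are case-sensitive

print(first_type, second_type, type(Value))
```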

In [None]:
# Integer, i.e., whole numbers both positive and negative. Later on 
# we will illustrate formatting the print command.

x = 4  # This statement creates a variable 'x' of type integer and assigns the value 4

# The following statements are used to illustrate Python's print() command and how it
# tells the user the type of a variable, its value, etc.

print('The type of x is:')
print(type(x))
print() # Just to give a space.
print('The value of x is:')
print(x)

In [None]:
# Floating point, i.e., computer representation of real numbers.

x = 4.0  # This statement creates a variable 'x' of floating point type and assigns the value 4.0

print('The type of x is:')
print(type(x))
print() # Just to give a space.
print('The value of x is:')
print(x)

In [None]:
# Strings. A string is a sequence of characters. They can be delimited
# by single quotes ('blah') or double quotes ("blah blah")

x = "four" #This statement creates a variable 'x' of type string and assigns the value "four"

print('The type of x is:')
print(type(x))
print() # Just to give a space.
print('The value of x is:')
print(x)

In [None]:
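# Compare how the notebook shows the value of a variable with and without
# the built-in print command. Only the last expression in a cell is echoed
# automatically, so x[3] is wrapped in print() to make it visible.

print(x[3])
x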

## 4. Conversions
Python has a built-in command `float()` that can convert integers and certain strings to floating point numbers.

In [None]:
# Start with an int.

x = 4

print('The type of x is:')
print(type(x))
print()
print('The value of x is:')
print(x)

In [None]:
# Convert to float

x = float(x)

print('The type of x is:')
print(type(x))
print()
print('The value of x is:')
print(x)

In [None]:
# Can also convert strings representing floating point numbers to
# float.

x = '1.67'

print('The type of x is:')
print(type(x))
print()
print('The value of x is:')
print(x)

In [None]:
# Note that when we print the string version of x it prints it just
# as if it were a floating point number, i.e., we can't tell from the
# output.

x = float(x)

print('The type of x is:')
print(type(x))
print()
print('The value of x is:')
print(x)

There is also a Python function `int()`, which can convert floats and certain strings to integers, and a function `str()`, which converts numbers to strings. It might be wise to experiment with them ...
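
One possible set of experiments with `int()` and `str()` is sketched below. The variable names are illustrative; the point is that `int()` truncates rather than rounds, and that it rejects strings that look like floats.

```python
# int() truncates a float toward zero; it does not round.
truncated_pos = int(4.7)
truncated_neg = int(-4.7)
print(truncated_pos, truncated_neg)

# int() accepts strings that look like integers ...
from_string = int("42")
print(from_string)

# ... but raises ValueError for strings that look like floats.
try:
    int("1.67")
except ValueError as err:
    print("ValueError:", err)

# Converting through float() first does work.
via_float = int(float("1.67"))
print(via_float)

# str() turns numbers into strings.
as_text = str(1.67)
print(as_text, type(as_text))
```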

## 5. Boolean type
A Boolean value has the Python type **bool**. The possible values a Boolean variable can take are **True** and **False**. These are typically used to hold the results of logical tests, which in turn can be used to control the flow of a Python program.

In [None]:
x = True

print('The type of x is:')
print(type(x))
print()
print('The value of x is:')
print(x)

## 6. Collection Data Types
There are four **collection** data types: **lists**, **tuples**, **sets**, and **dictionaries**. (Some say that a **string** is also a collection data type since it is an ordered sequence of characters; more on this later.)

### A. <u>Lists</u> are ordered, changeable, and allow duplicate members:

In [None]:
# Create a list with 5 elements.

Coloradothings = ["wheat", "corn", "sugar beets", "pinto beans", 1959]

print('The type of Coloradothings is:')
print(type(Coloradothings))

In [None]:
print('The length of Coloradothings is:')
print(len(Coloradothings)) #Notice our use of the built-in python function 'len()'

In [None]:
print('The value of Coloradothings is:')
print(Coloradothings)

In [None]:
# The elements inside of Coloradothings may be of differing
# types ...

print('For Coloradothings[3] ...')
print(Coloradothings[3])
print(type(Coloradothings[3]))
print()
print('For Coloradothings[4] ...')
print(Coloradothings[4])
print(type(Coloradothings[4]))

In [None]:
Coloradothings[0]

In [None]:
# We can append to a list and insert in a list. These particular functions are called 'methods' in
# python. Different sorts of objects have particular methods that work for them.

# The append method ...

Coloradothings.append("Amherst")
print(Coloradothings)

In [None]:
# The insert method ...

Coloradothings.insert(2, "sunflowers")
print(Coloradothings)

### B. <u>Tuples</u> are ordered, unchangeable, and allow duplicate members.

In [None]:
# Make a tuple.

Indianathings = ("Basketball", "Corn")

print(type(Indianathings))
print()
print(Indianathings)

In [None]:
# This command will yield an error because append is NOT an allowable method
# for tuples ...

Indianathings.append("Wall street")

A tuple can contain a single item, but it must then be written with a comma after its only element, e.g., `Indianathings = ("Basketball")` is not a tuple (it's a string), while `Indianathings = ("Basketball",)` is a tuple.

Tuples **cannot** be changed. For example, if Indianathings is a tuple then `Indianathings.append("Wall street")` will cause an error.
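
The two points above can be checked with a short sketch. The variable names here are illustrative and are not used elsewhere in the lecture.

```python
not_a_tuple = ("Basketball")    # parentheses alone do not make a tuple
one_item = ("Basketball",)      # the trailing comma does

print(type(not_a_tuple))
print(type(one_item))

# Tuples have no append method, so this raises AttributeError.
try:
    one_item.append("Corn")
except AttributeError as err:
    print("AttributeError:", err)

# A new, longer tuple can still be built from existing ones.
longer = one_item + ("Corn",)
print(longer)
```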

### C. <u>Sets</u> are unordered and changeable (items can be added and removed), and do not allow duplicate members.

In [None]:
# Make a set.

Purduethings = {"Ag and Bio Engineering", "Ross-Ade Stadium", "students", "professors", "Gene Keady", "study sessions"}

Purduethings

In [None]:
print(type(Purduethings))
print()
print(Purduethings) # Note the order it prints

In [None]:
# Tests producing Boolean values ...

print("Ag and Bio Engineering" in Purduethings)

In [None]:
print("Medical School" in Purduethings)

In [None]:
print("Gene Cernan" in Purduethings)

In [None]:
print({"Gene Cernan",}.issubset(Purduethings))

In [None]:
Purduethings.add("Gene Cernan")
print({"Gene Cernan",}.issubset(Purduethings))

From the code output above we note:

1. The order in which we listed the items when defining the set is not necessarily the order Python uses when printing them, because sets are unordered.
2. An expression such as `"Ag and Bio Engineering" in Purduethings` evaluates to a Boolean value (`True` or `False`).

We can perform classical set operations via the methods ... (**union**, **intersection**, **difference**, **test subset**):

In [None]:
# Make another set ...

IUthings = {"Hoosiers", "Bobby Knight", "students", "professors", "parties"}

IUthings

In [None]:
Purduethings

In [None]:
Purduethings.union(IUthings)

In [None]:
IUthings.union(Purduethings)

In [None]:
Purduethings.intersection(IUthings)

In [None]:
IUthings.difference(Purduethings)

In [None]:
Purduethings.difference(IUthings)
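
The same set operations can also be written with operators instead of method calls. The sketch below uses two small stand-in sets (hypothetical, not the sets defined above) so that it runs on its own.

```python
# Small stand-in sets (hypothetical, not the lecture's sets).
a = {"students", "professors", "stadium"}
b = {"students", "professors", "parties"}

union = a | b             # same as a.union(b)
common = a & b            # same as a.intersection(b)
only_in_a = a - b         # same as a.difference(b)
is_subset = common <= a   # same as common.issubset(a)

# sorted() gives a repeatable order for printing; sets themselves are unordered.
print(sorted(union))
print(sorted(common))
print(sorted(only_in_a))
print(is_subset)
```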

### D. <u>Dictionaries</u> are changeable and indexed by key (and, since Python 3.7, keep their insertion order). Written with "{}" but made up of key-value pairs.

A **key-value pair** is a key and a value separated by a colon. Keys are often strings, and values can be of any type. Different key-value pairs are separated by commas. It looks like `{"key1": "value1", "key2": "value2"}`.

In [None]:
# Make some dictionaries of farm equipment.
OldCombine = {"brand": "CASE", "model": "7130", "year": 2014}
NewCombine = {"brand": "CASE", "model": "8240", "year": 2016}
Tractor1 = {"brand": "CASE", "model": "290", "year": 2013}
Pickup = {"brand": "CHEVY", "model": "Silverado", "year": 2005}
FavoriteOldCombineEver = {"brand": "JD", "model": "7720", "year": 1978, "color": "green"}

In [None]:
print(type(FavoriteOldCombineEver))
print()
print(FavoriteOldCombineEver)

**Note:** Dictionaries can contain dictionaries.

In [None]:
# Create a dictionary of farm equipment from the dictionaries of
# individual machines.

FarmEquipment = {"C1": OldCombine, "C2": NewCombine, "T1": Tractor1, "P1": Pickup, "C3": FavoriteOldCombineEver}

In [None]:
FarmEquipment
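
Because dictionaries are indexed by key, individual values are looked up with square brackets, and nested dictionaries are reached by chaining lookups. The sketch below uses a small stand-in dictionary (the names `combine` and `equipment` are illustrative) so that it runs on its own.

```python
# A small stand-in for the equipment dictionaries defined above.
combine = {"brand": "JD", "model": "7720", "year": 1978}
equipment = {"C3": combine}

print(combine["year"])             # look up a value by its key
print(equipment["C3"]["brand"])    # chained lookup into a nested dictionary

# Assigning to a new key adds a key-value pair.
combine["color"] = "green"
print(equipment["C3"]["color"])    # the nested entry is the same dictionary object

# get() returns None instead of raising an error when a key is missing.
missing = combine.get("owner")
print(missing)

# items() yields the key-value pairs one at a time.
for key, value in combine.items():
    print(key, value)
```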

## 7. Operators

### Arithmetic operators: +, -, *, /, %, **

In [None]:
# addition

7 + 5

In [None]:
# subtraction

7 - 5

In [None]:
# multiplication

7 * 5

In [None]:
# division

7 / 5

In [None]:
# remainder upon integer division

7 % 5

In [None]:
# exponentiation

7 ** 5
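
The remainder operator `%` pairs naturally with floor division, `//`, which is not in the operator list above but is a standard Python operator. A minimal sketch of how the two fit together, using the same numbers as the cells above:

```python
quotient = 7 // 5     # floor division: how many whole times 5 goes into 7
remainder = 7 % 5     # what is left over

print(quotient, remainder)

# The two results reconstruct the original number.
print(5 * quotient + remainder == 7)

# By contrast, / always performs true division and returns a float.
print(7 / 5)
```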

### Assignment operators: =, +=, -=, *=, /=, **=

In [None]:
# Assignment operators: =, +=, -=, *=, /=, **=

b = 5
a = b
print(a)

In [None]:
a += b # shorthand for a = a + b
print(a)

In [None]:
a -= b # shorthand for a = a - b
print(a)

In [None]:
a *= b # shorthand for a = a*b
print(a)

In [None]:
a /= b # shorthand for a = a/b
print(a)

In [None]:
a **= b # shorthand for a = a**b
print(a)

### Comparison operators: ==, !=, <, <=, >, >=

In [None]:
# Comparison operators: ==, !=, <, <=, >, >=

a = 3
b = 2

In [None]:
a == b

In [None]:
a != b

In [None]:
a < b

In [None]:
a <= b

In [None]:
a > b

In [None]:
a >= b

### Logical operators: and, or, not

In [None]:
# Just to recall the values set above ...

print('a equals ...')
print(a)
print()
print('b equals ...')
print(b)

In [None]:
# Logical operators: and, or, not

x = (a == b) # The expression a == b is a Boolean value (either True or False).
             # The assignment creates a Boolean variable x
print(type(x))
print(x)

In [None]:
y = not x
print(type(y))
print(y)

In [None]:
z = True

print(x or z)
print(x and z)

## 8. Control statements
There are three methods of program control that we consider here:
1. If/else statement
2. For loops
3. While loops

In [None]:
# Example if/else statement

a = 2
b = 3
if b > a:
    print("b is greater than a")
elif a == b:
    print("a and b are equal")
else:
    print("a is greater than b")

In [None]:
# While loop: Execute while condition is true.

i = 1
while i < 6:
    print(i)
    i += 1

In [None]:
# For loop: Iterate over a sequence. Two related commands are break (stop the loop
# where it is and exit) and continue (skip to the next iteration of the loop).

for x in "banana":
    print(x)

In [None]:
# Try the continue command

for x in "banana":
    if x == "n":
        continue
    print(x)

In [None]:
# Try the break command

for x in "banana":
    if x == "n":
        break
    print(x)
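
A for loop works the same way over other sequences. As a minimal sketch, the cell below iterates over a list and over a range of integers; the list contents and the name `total` are illustrative only.

```python
# Iterate over the items of a list.
crops = ["wheat", "corn", "sugar beets"]
for crop in crops:
    print(crop)

# range(1, 6) produces 1, 2, 3, 4, 5 (the stop value 6 is excluded).
total = 0
for i in range(1, 6):
    total += i
print(total)
```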

## 9. Exercise: Set up GitHub

### Yang Wang on signing up for GitHub ...

<a href="https://www.youtube.com/watch?v=ZhHDfZ-l7ZU" target="_blank">Sign Up for GitHub</a>

### Yang Wang on ...

<a href="https://www.youtube.com/watch?v=ZB9VgHFqqXU" target="_blank">Fork a GitHub Repository</a>

<a href="https://www.youtube.com/watch?v=nT1NPCyTtyo" target="_blank">Commit Changes to a GitHub Repository</a>

### Now you should:

1. Create your own GitHub account if you do not already have one. If you want to keep this account after this summer, use an email address that you will have for the long run.
2. Fork the repo `jvkrogmeier/REEU23`
3. Click on the mybinder link to run a version of the code.
4. Experiment with the code for `Lect-1a`

## 10. Functions
   
Functions are blocks of code that run when called.

- Can pass parameters to a function. 
- A function can return a value.

Functions make code more readable by hiding the details of an operation that may not be central to understanding the overall algorithm. This is sometimes called encapsulation. For example, perhaps we want to solve some sort of geometric problem, such as finding the height of a tree from the angle of the sun and the length of the shadow cast on the ground. The height calculation will involve intermediate calculations of trigonometric functions of the angle (e.g., sine, cosine, tangent). These sorts of intermediate calculations are naturally left to functions in Python and other programming languages.

In addition, functions ...

- Assist in divide and conquer problem solving.
- Allow the function code to be reused in other parts of a larger program.
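
The tree-height example mentioned above might be sketched as follows. This assumes flat ground and that the angle is the sun's elevation above the horizon, measured in degrees. The function name `tree_height` and its parameter names are hypothetical, chosen only for this illustration.

```python
import math

def tree_height(sun_elevation_deg, shadow_length):
    """Estimate the height of a tree on flat ground.

    Assumes sun_elevation_deg is the sun's angle above the horizon in
    degrees and shadow_length is in the same units as the returned height.
    """
    angle_rad = math.radians(sun_elevation_deg)   # math.tan expects radians
    return shadow_length * math.tan(angle_rad)

# With the sun 45 degrees above the horizon, the tree is about as tall
# as its shadow is long.
print(tree_height(45, 10))
```

The caller only needs to know what goes in and what comes out; the conversion to radians and the tangent calculation stay hidden inside the function.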

In [None]:
# Functions start with a definition line ...

def Add5(a):          # This defines the function name and the fact that we pass a parameter "a"
    return a + 5      # This line "adds 5" and returns a value. The body of the function is delimited by
                      # its indentation (inserted automatically by the Jupyter notebook in this case)

In [None]:
# Test it ...
Add5(1)

In [None]:
# What about this ... ?
Add5("one")

In [None]:
Add5(1.2)

In [None]:
# Can have more than one parameter ...

def JVKAdd(a,b):
    x = a + b
    return x

In [None]:
JVKAdd(2,3)

In [None]:
# It is not required to pass a parameter ...

def JustSayHi():
    print("Hi")

In [None]:
JustSayHi()

### A Useful Function Example

According to Wikipedia, this algorithm is known as the Babylonian method; its first explicit description is attributed to Hero of Alexandria in the first century AD. It is widely used for computing square roots by hand. The idea is this. If we want to find the square root of a positive number, say $Z$, we first start with a guess $x$, hoping that $x^2 \approx Z$.

Now if the original guess is **too large,** i.e., $x^2 > Z$ then $x > Z/x$ and so we could move in the correct direction (towards smaller values of $x$) by making a new guess equal to the average of $x$ and $Z/x$, i.e.,

New guess = $(x + Z/x)/2$.

If, on the other hand, the original guess was **too small,** i.e., $x^2 < Z$ then $x < Z/x$ and using the above formula for the new guess would move in the correct direction of larger values. The algorithm is implemented in Python in the function code below.

### Hand calculation example ...

Say Z = 10 and guess x = 3 for the square root. Then the next guess is the average of 3 and 3.3333..., which is approximately 3.16666... The next step in the algorithm gives an estimate of

3.1622
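
Written out step by step, the hand calculation looks like this. The subscripts $x_0, x_1, x_2$ are labels introduced here for the successive guesses; they are not notation used elsewhere in these notes.

$$x_1 = \frac{1}{2}\left(x_0 + \frac{Z}{x_0}\right) = \frac{1}{2}\left(3 + \frac{10}{3}\right) = \frac{19}{6} \approx 3.1667$$

$$x_2 = \frac{1}{2}\left(x_1 + \frac{Z}{x_1}\right) = \frac{1}{2}\left(\frac{19}{6} + \frac{60}{19}\right) = \frac{721}{228} \approx 3.16228$$

For comparison, $\sqrt{10} \approx 3.16228$, so two steps already agree with the true value to about five decimal places.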

### Brute force coding of the hand calculation example

In [None]:
Z = 10      # Want the square root of this number.
K = 5       # Number of iterations to try
Guess = 3   # Initial guess of the square root of Z

i = 1
RootZ = Guess

while i <= K:
    NewRootZ = (RootZ + Z/RootZ)/2   # Average of the current guess and Z/guess
    RootZ = NewRootZ
    i = i + 1
    print(RootZ)


### Play around with the numbers above ...

## 11. Exercise: Implementation of the Babylonian Algorithm for the Square Root

Write a function which encapsulates the example code given above. The inputs to the function should be:

* A positive number whose square root is desired
* An initial guess as to the value of the square root
* The number of iterations to try
