Authors: Sridhar Nerur, Samuel Jayarajan, and Mahyar Vaghefi

Affiliation: The University of Texas at Arlington

Hello and welcome to our Python class. Python's popularity has grown considerably over the years, and it is arguably the most popular language today (see https://www.wired.com/story/python-language-more-popular-than-ever/ and http://pypl.github.io/PYPL.html). This growth in demand has been spurred in no small measure by the increasing emphasis on data-driven decision making.

Python was developed by Guido van Rossum in the Netherlands, primarily as an accessible language that would be easy to work with. The language, whose name was inspired by the Monty Python comedy series, has many strengths, not least of which are its readability, terseness of expression, support for object-oriented concepts (e.g., inheritance and polymorphism), and the ever-increasing library of modules for various purposes. More importantly, it has found favor among data scientists and AI/deep learning practitioners.

    Here is what we will cover:
    a) Brief introduction to variables
    b) Arithmetic operations (+, -, *, /, %)
    c) Strings and methods associated with strings
    d) Lists
    e) Dictionaries
    f) Sets
    g) Tuples
    h) Programming constructs - branching and looping
    i) Functions
    j) File Handling
    k) Exception Handling
    l) Simple text analysis
    m) Introduction to Pandas
    n) Introduction to Numpy
    o) Working with databases (SQL)

Each of these topics will be presented in a separate Jupyter notebook.

Software recommendations:
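1. Anaconda is one of my favorites. It is a Python distribution (rather than an IDE in itself) that bundles Jupyter Notebook and the Spyder IDE. It may be downloaded from https://www.anaconda.com/distribution/.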

2. PyCharm is pretty good as well. See https://www.jetbrains.com/pycharm/.

3. I would also recommend Google Colab (https://colab.research.google.com/notebooks/intro.ipynb#recent=true). In addition to providing notebooks, it gives you free access to a GPU.

Let us get started.


Do the following look familiar to you?

    F = 1.8 * C + 32.0 .........(1)
    SI = P * t * r .............(2)

Equation 1 is the formula for calculating the Fahrenheit temperature given the Centigrade temperature. For example, if C = 20.0 degrees, the Fahrenheit equivalent is: (1.8 * 20.0 + 32.0) = 68.0. Here, F stands for "Fahrenheit" and C for "Centigrade". We can plug in any value of C and compute the corresponding Fahrenheit temperature. Thus, C is not a fixed value but a variable that can have any reasonable Centigrade value.
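Equation 1 can be tried out directly in a cell. The following is a minimal sketch; the names "centigrade" and "fahrenheit" are illustrative choices, not part of the original notebook.

```python
# Equation (1): convert a Centigrade temperature to Fahrenheit.
# The variable names are illustrative assumptions.
centigrade = 20.0
fahrenheit = 1.8 * centigrade + 32.0
print("Fahrenheit:", fahrenheit)

# Because centigrade is a variable, we can assign a new value and recompute.
centigrade = 100.0
fahrenheit_at_boiling = 1.8 * centigrade + 32.0
print("Fahrenheit:", fahrenheit_at_boiling)
```

Changing the value assigned to "centigrade" and rerunning the cell gives the Fahrenheit equivalent for any temperature you like.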

Equation 2 shows you how one can determine simple interest (SI) given the principal amount (P), the rate of interest (r; e.g., 0.05 or 5%), and the time (t) over which simple interest is earned. We can vary the principal, rate of interest, and time to see how SI changes.

So, you may think of a variable (such as C, F, P, t, r, SI) as a reference to a location in the computer's memory that contains a particular value. Unlike PI (3.14...) or E (2.718...), whose values never change, a variable can hold any valid value at any given time. PI and E are called constants.
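Equation 2 and the idea of constants can be sketched in the same way. The sample values and variable names below are assumptions chosen for illustration, and time is assumed to be measured in years. The constants come from Python's standard "math" module.

```python
import math

# Equation (2): simple interest. All values here are illustrative.
principal = 1000.0   # P
rate = 0.05          # r, i.e., 5%
time_years = 2       # t, assumed to be in years

simple_interest = principal * time_years * rate
print("Simple interest:", simple_interest)

# Variables can be reassigned at any time; here we double the time period.
time_years = 4
simple_interest_longer = principal * time_years * rate
print("Simple interest over 4 years:", simple_interest_longer)

# PI and E are available as math.pi and math.e; we treat them as constants.
print("PI:", math.pi)
print("E:", math.e)
```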

In Python, a variable name may contain letters of the alphabet (a-z or A-Z), digits (0-9), and underscores. It must begin with a letter - uppercase or lowercase A through Z - or an underscore ("_"); it CANNOT begin with a digit. By convention, a leading underscore marks a variable that is intended for internal use. The following cells show some examples of variables.
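One way to check these naming rules without triggering a syntax error is the built-in string method "isidentifier". The sketch below uses a few made-up candidate names. Note that it only checks the naming rules; it does not tell you whether a name is a reserved word.

```python
# Check candidate names against Python's naming rules.
# The candidate names are illustrative examples.
starts_with_letter = "apple_1982".isidentifier()
starts_with_underscore = "_apple".isidentifier()
starts_with_digit = "1982_apple".isidentifier()
contains_slash = "/_apple".isidentifier()

print("apple_1982:", starts_with_letter)      # True
print("_apple:", starts_with_underscore)      # True
print("1982_apple:", starts_with_digit)       # False
print("/_apple:", contains_slash)             # False
```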

Note on executing a cell in Jupyter:
Shift+Enter runs your cell and moves to the next one, while Ctrl+Enter runs the cell and stays on it. Alternatively, you can click on the "Cell" menu option and select "Run Cells".

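Please note that Python ignores everything that comes after a "#" on a line (unless the "#" appears inside a string), so that text will not be executed. Thus, # can be used for single-line comments that enhance the readability of your code. If you wish to have comments that span multiple lines, you can enclose them in triple quotes. Strictly speaking, this creates a string that Python evaluates and then discards, but it is commonly used as a multi-line comment.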
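The different kinds of comments can be sketched as follows. The variable names and values are made up for illustration.

```python
# This whole line is a comment and is not executed.
price = 250  # a comment can also follow code on the same line

"""
This triple-quoted text spans several lines.
Python treats it as a string that is never used,
so it behaves like a multi-line comment.
"""

# A "#" inside quotes is part of the string, not the start of a comment.
label = "item #1"
print(price, label)
```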

In [5]:
a = 5 #a is a variable that has 5 assigned to it
#let us see what the variable "a" has
print(a) #Python uses the "print" function to display variables and strings

5


Some points to note:
1. Use meaningful variable names - the previous cell used "a" as a variable name, which really doesn't tell you the purpose of the variable.
2. Python - like C and Java - is case-sensitive. Therefore, variables "rate" and "Rate" are not the same. 
3. Variables generally store data of a particular type. In our example above, variable "a" stores an integer value. We say that the data type of "a" is int. Unlike languages like Java, C, or C++, Python does not require the programmer to specify the type while creating a variable.
4. Data can be of different types. Common ones are strings (str), integers (int), floating-point numbers (float), and Booleans (bool: True or False). A string variable stores strings, an int variable stores ints, and so forth.

   Examples of data types:
    12, 25, 83, 10000 --> these are all integers (ints)/whole numbers
    12.2, 3.87, 3.14, 100.00 --> these are floats (what some other languages call doubles)
    True, False --> these are called "bool", short for boolean
    "Viv Richards", "INSY 5336", "Baseball", 'cricket' --> these are strings.

Note that strings may be enclosed in single, double, or triple quotes. Triple quotes are handy for long strings that span multiple lines.
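The points above can be checked with a short sketch. The variable names and values are illustrative assumptions.

```python
# Case sensitivity: "rate" and "Rate" are two different variables.
rate = 0.05
Rate = 5
print(rate, Rate)

# Python infers the type from the value that is assigned.
whole_number = 12
decimal_number = 3.14
is_ready = True
player = "Viv Richards"   # double quotes
sport = 'cricket'         # single quotes
course = """INSY 5336"""  # triple quotes

print(type(whole_number))    # <class 'int'>
print(type(decimal_number))  # <class 'float'>
print(type(is_ready))        # <class 'bool'>
print(type(player), type(sport), type(course))
```

All three quoting styles produce the same type, str.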


In [3]:
#let us check the data type of the variable "a" that you defined above
#Python provides the built-in function "type" for this purpose....

type(a) #should display "int"

int

The statement "a = 5" means that the value 5 (an integer constant) is assigned to a variable called "a". The "=" is an assignment operator. The key thing to remember is that the variable that is assigned the value will be on the left-hand side of "=", while what is assigned will be on the right-hand side. You could assign any value (int, float, string, bool, an object in Python, etc.) or another variable. Note that "5 = a" is incorrect and results in a syntax error.
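Here is a small sketch of assignment in action. It assigns one variable to another and then gives the first variable a value of a different type. The names are made up for illustration.

```python
count = 5              # assign an int to count
copy_of_count = count  # assign the value of another variable
count = "five"         # the same name can later hold a value of another type

print(count)          # five
print(copy_of_count)  # 5
print(type(count), type(copy_of_count))
```

Reassigning "count" does not affect "copy_of_count", which still holds the integer 5.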

In [None]:
#What will this print? Run the cell to see if your reasoning is correct
a = 13
b = a + 5
print(b)

In [10]:
#Uncomment and try the following variable names and see which ones work and which ones don't
#1982_apple = 100 # should give you a syntax error --> can't start with a digit
#/_apple = 50 #also gives you a syntax error --> / is not a valid character
#apple_1982 = 23 #this should work
#_apple = "This is a special apple" #note that this is a string variable; should work

In [None]:
#use meaningful names
interest_rate = 0.05 #don't use r = 0.05
interestRate = 0.05 #camel case is popular in some languages; Python style (PEP 8) prefers interest_rate


In [11]:
#boolean variable -- can be True or False
proceed = False
type(proceed)

bool

In [None]:
#simple arithmetic
amount = 1000.00
amount = amount + 100 #increase amount by 100; the shorthand is: amount += 100


In [12]:
#can we do arithmetic on bool variables?
flag = True
flag + 1

2

In [13]:
#So, how did we get 2 in the cell above? When bool variables are used in arithmetic
#operations, True is replaced by 1 and False by 0
flag = False
flag + 1

1
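The reason this works is that, in Python, bool is a subclass of int. The sketch below checks this and then uses it to count how many flags are True. The flag names are hypothetical.

```python
# bool is a subclass of int, so True and False behave like 1 and 0.
print(isinstance(True, int))  # True
print(True == 1, False == 0)  # True True

# Adding bool values counts how many of them are True.
passed_quiz, passed_exam, passed_project = True, False, True
total_passed = passed_quiz + passed_exam + passed_project
print("Number passed:", total_passed)  # Number passed: 2
```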

In [1]:
#You can also assign to multiple variables at the same time, as shown below
a, b = 15, 20
print("A: ", a)
print("B: ", b)

A:  15
B:  20


In [4]:
#slightly more complex example
x, y, z = "abc" 
print("X: ", x)
print("Y: ", y)
print("Z: ", z)

X:  a
Y:  b
Z:  c


What do you think will happen if we did the following?

x, y, z = "abcd"

Try it!
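Once you have tried it, you can compare your result with the sketch below. It uses try/except, which is covered in the exception handling notebook, to catch the error so that the cell keeps running. The variable names are illustrative.

```python
text = "abcd"
message = ""
try:
    # Four characters cannot be unpacked into three variables.
    x, y, z = text
except ValueError as error:
    message = str(error)
    print("Unpacking failed:", message)

# The number of variables must match the number of items.
w, x, y, z = text
print(w, x, y, z)  # a b c d
```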

Congratulations! You are off to a good start. You now know something about variables in Python. Let us go to the next Jupyter notebook.