# Lecture 2-1
# Basic Data Types and Variables

## Week 2 Monday

## With thanks to Miles Chen, PhD

### Adapted from *Think Python* by Allen B. Downey and *A Whirlwind Tour of Python* by Jake VanderPlas

## Values and Types

There are different types of data in **base Python**. Other important data types become available only after importing libraries such as NumPy and pandas. We begin with the types built into base Python.

The most commonly used ones will be:

- `str` - strings: for text data
- `int` - integers
- `float` - floats for numbers with decimal values
- `bool` - boolean: True or False
- `NoneType` - the type of the reserved value `None`, which is used to indicate null (missing) values

Python has important data structures that we will cover later, including:

- sequences: `list`, `tuple`, and `range`
- mappings: `dict`
- sets: `set`
- binary: `bytes`

## Values and Types (cont'd)

In [1]:
type("2")

str

In [2]:
type(2)

int

In [3]:
type(2.0)

float

## Values and Types (cont'd)

In [4]:
type(True)

bool

In [5]:
type(None)

NoneType

## Type Conversion examples

In [6]:
str_to_int = int("2")
type(str_to_int)

int

In [7]:
int_to_str = str(1)
print(int_to_str)
type(int_to_str)

1


str

## Type Conversion examples (cont'd)

In [8]:
int_ex = 2
print(type(int_ex))
int_to_float = float(int_ex)
print(type(int_to_float))

<class 'int'>
<class 'float'>


In [9]:
print(int(True))
print(int(False))

1
0
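
Not every conversion succeeds, and some discard information. The following is a minimal sketch of a few edge cases; the variable names are illustrative and not part of the lecture.

```python
# int() truncates a float toward zero; it does not round.
truncated_positive = int(2.9)     # 2
truncated_negative = int(-2.9)    # -2

# A string with a decimal point can be parsed by float().
parsed_float = float("2.5")       # 2.5
parsed_int = int(float("2.5"))    # 2

# int() cannot parse "2.5" directly; it raises a ValueError.
try:
    int("2.5")
    direct_conversion_worked = True
except ValueError:
    direct_conversion_worked = False

print(truncated_positive, truncated_negative, parsed_int, direct_conversion_worked)
```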


# Math operations in Python

Base Python has only a few math operations

- `x + y`	sum of x and y.
- `x * y`	multiplication of x and y.
- `x - y`	difference of x and y.
- `x / y`	division of x by y.
- `x // y`	floor division of x by y (the quotient rounded down).
- `x % y`	remainder of x divided by y.
- `x ** y`	x to the power of y
- `abs(x)`	absolute value of x
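
Floor division and remainder are the two operators in this list that most often surprise people, so here is a short sketch of how they relate. The values 17 and 5 are arbitrary choices for illustration.

```python
x = 17
y = 5

quotient = x // y      # 3
remainder = x % y      # 2

# The two results always fit together: x == (x // y) * y + (x % y)
identity_holds = (quotient * y + remainder == x)

# Floor division rounds toward negative infinity, not toward zero.
negative_quotient = -17 // 5     # -4
negative_remainder = -17 % 5     # 3

# With a float operand the result is a float, but still a whole number.
float_quotient = 17.0 // 5       # 3.0

print(quotient, remainder, negative_quotient, negative_remainder, float_quotient)
```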

Adding integers together results in an integer.

In [10]:
x = 10
y = 5
print(type(x))
print(type(y))

<class 'int'>
<class 'int'>


In [11]:
z = x + y
type(z)

int

In [12]:
z

15

Multiplying integers together results in an integer.

In [13]:
x = 10
y = 5

In [14]:
z = x * y
type(z)

int

In [15]:
z

50

Division with `/` always results in a float, even when both operands are integers.

In [16]:
x = 10
y = 5

In [17]:
z = x / y
type(z)

float

A float is displayed with a decimal point even when its value is a whole number.

In [18]:
z

2.0

The sum or product of an integer and a float is a float.

In [19]:
x = 10.0
y = 5
print(type(x))
print(type(y))

<class 'float'>
<class 'int'>


In [20]:
x + y

15.0

In [21]:
x * y

50.0

# Floating Point Type

A Python `float` uses 64 bits to represent a number. It can represent many values, but only a finite number of distinct ones, so most decimal values are stored as the nearest representable approximation.

A floating point number carries approximately 16 significant decimal digits of precision.

The maximum value is `1.7976931348623157e+308`, available as `sys.float_info.max` (a little less than $2^{1024}$).
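
The limits described above can be inspected through `sys.float_info`. This is a small sketch, assuming the usual 64-bit double-precision floats; the variable names are illustrative.

```python
import sys

largest_float = sys.float_info.max         # 1.7976931348623157e+308
machine_epsilon = sys.float_info.epsilon   # gap between 1.0 and the next larger float

# An addend far smaller than epsilon is lost when added to 1.0.
tiny_addend_is_lost = (1.0 + 1e-17 == 1.0)

# Multiplying past the maximum does not raise an error; it gives infinity.
overflowed_product = largest_float * 2

print(largest_float, machine_epsilon, tiny_addend_is_lost, overflowed_product)
```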

In [22]:
2.0 ** 1023

8.98846567431158e+307

In [23]:
2.0 ** 1023 + 2.0 ** 1022 + 2.0 ** 1021

1.5729814930045264e+308

## Floating Point Type (cont'd)

In [24]:
2.0 ** 1024 # this is too big to be represented with 64 bits in double floating point

OverflowError: (34, 'Result too large')

A consequence: floating point arithmetic does not behave exactly like arithmetic on real numbers.

In [25]:
a = (1 + 2) / 10

In [26]:
b = (1/10 + 2/10)

In [27]:
a == b # with real numbers, we expect these to be equal

False

In [28]:
print("%0.20f" % a) # format to print 20 places after decimal

0.29999999999999998890


## Floating Point Type (cont'd)

In [29]:
print("%0.20f" % b)

0.30000000000000004441


To check whether two numbers are approximately equal, you can use `isclose()` from the `math` module.

In [30]:
import math
math.isclose(a, b)

True
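
By default `math.isclose()` uses a relative tolerance (`rel_tol=1e-09`) and an absolute tolerance of zero, which matters when comparing against zero. The sketch below illustrates this; the tolerance `1e-9` passed as `abs_tol` is an arbitrary choice for illustration.

```python
import math

a = (1 + 2) / 10
b = 1/10 + 2/10

# Default check: relative tolerance of 1e-09.
default_check = math.isclose(a, b)

# A relative tolerance is useless against zero, so this is False.
near_zero_default = math.isclose(1e-10, 0.0)

# Supplying an absolute tolerance makes the comparison succeed.
near_zero_with_abs_tol = math.isclose(1e-10, 0.0, abs_tol=1e-9)

print(default_check, near_zero_default, near_zero_with_abs_tol)
```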

# Integer type

Integers in Python use a variable amount of memory, so they can represent very large numbers exactly.

In [31]:
2 ** 1023

89884656743115795386465259539451236680898848947115328636715040578866337902750481566354238661203768010560056939935696678829394884407208311246423715319737062188883946712432742638151109800623047059726541476042502884419075341171231440736956555270413618581675255342293149119973622969239858152417678164812112068608

In [32]:
2 ** 1024

179769313486231590772930519078902473361797697894230657273430081157732675805500963132708477322407536021120113879871393357658789768814416622492847430639474124377767893424865485276302219601246094119453082952085005768838150682342462881473913110540827237163350510684586298239947245938479716304835356329624224137216

In [33]:
2 ** 1025

359538626972463181545861038157804946723595395788461314546860162315465351611001926265416954644815072042240227759742786715317579537628833244985694861278948248755535786849730970552604439202492188238906165904170011537676301364684925762947826221081654474326701021369172596479894491876959432609670712659248448274432
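
To contrast the exactness of integers with the limited precision of floats, here is a brief sketch. The choice of $2^{53}$ is deliberate: it is the point past which a 64-bit float can no longer represent every whole number.

```python
big_int = 2 ** 53

# Integer arithmetic stays exact at any size.
int_is_exact = ((big_int + 1) - big_int == 1)

# The same value as a float cannot register an increment of 1.
big_float = float(big_int)
float_loses_one = (big_float + 1 == big_float)

# An integer that exceeds the float range cannot be converted at all.
try:
    float(2 ** 1024)
    conversion_worked = True
except OverflowError:
    conversion_worked = False

print(int_is_exact, float_loses_one, conversion_worked)
```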

### Exponentiation

In [34]:
9 ** 2 # power operator. Can result in float or int depending on input.

81
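
As the comment above notes, the result type of `**` depends on the inputs. A minimal sketch with illustrative variable names:

```python
int_power = 9 ** 2            # int base, non-negative int exponent -> int
float_base_power = 9.0 ** 2   # float base -> float
negative_exponent = 2 ** -1   # negative exponent -> float

print(type(int_power), type(float_base_power), type(negative_exponent))
```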

There is no square root function in base Python

In [35]:
sqrt(9)

NameError: name 'sqrt' is not defined

### Alternative to Square Root

In [36]:
9 ** 0.5 # could work as an alternative to sqrt function.

3.0

Mathematical constants and many math functions are not defined in base Python. To gain access to common mathematical constants and functions, you must import the `math` module.

In [37]:
pi

NameError: name 'pi' is not defined

In [38]:
exp(2)

NameError: name 'exp' is not defined

In [39]:
sin(0)

NameError: name 'sin' is not defined

In [40]:
cos(0)

NameError: name 'cos' is not defined

## About the `math` module

To do math in Python, you can import the `math` module. The `math` module is part of the Python Standard Library, which ships with most distributions of Python. It is used for basic computations involving scalars.

The `numpy` library (usually pronounced "num-pie") is not part of the Python Standard Library, but it does come with the Anaconda distribution. It is what you use for matrices, arrays, and large data.

![](PSL.png)

![](Numpy.png)

In [41]:
import math

In [42]:
math.sqrt(9)

3.0

In [43]:
math.pi

3.141592653589793

In [44]:
math.exp(2)

7.38905609893065

In [45]:
math.sin(math.pi / 2) # the math.sin function uses radians

1.0
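
Besides `import math`, individual names can be imported directly with `from math import ...`. The sketch below shows both styles side by side; the triangle and circle values are made up for illustration.

```python
import math
from math import pi, sqrt

# Names imported with "from" are used without the "math." prefix.
hypotenuse = sqrt(3 ** 2 + 4 ** 2)
circle_area = pi * 2 ** 2

# math.radians converts degrees to the radians that math.sin expects.
sine_of_90_degrees = math.sin(math.radians(90))

print(hypotenuse, circle_area, sine_of_90_degrees)
```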

# Boolean Type

Booleans are used to express True or False

In [46]:
type(True)

bool

In [47]:
type("True")

str

`True` and `False` each have exactly one accepted spelling, and it is case-sensitive. Any other spelling is treated as an ordinary name, not as the boolean value.

In [48]:
type(TRUE) # TRUE or T or t or true

NameError: name 'TRUE' is not defined
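
A related pitfall is confusing the string `"True"` with the boolean `True`. The following sketch shows how a few values behave when compared with or converted to `bool`; the variable names are illustrative.

```python
# A string is never equal to a boolean, whatever its text says.
string_equals_boolean = ("True" == True)

# bool() of any non-empty string is True, even the string "False".
non_empty_string = bool("False")
empty_string = bool("")

# Booleans behave as the integers 1 and 0 in arithmetic.
count_of_true = True + True + False

print(string_equals_boolean, non_empty_string, empty_string, count_of_true)
```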

# String Type

Strings in Python are created with single or double quotes

In [49]:
message1 = "Hello! How are you?"
message2 = 'fine'
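
Having two quote styles is convenient when the text itself contains a quote character. A short sketch, with made-up strings:

```python
# Use double quotes when the text contains an apostrophe...
contraction = "It's fine"

# ...and single quotes when the text contains double quotes.
quotation = 'She said "hello"'

# A backslash escapes a quote that matches the delimiters.
escaped = 'It\'s fine'

# The quote style does not change the resulting string.
same_text = ("fine" == 'fine')

print(contraction, quotation, escaped == contraction, same_text)
```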

Here are some string functions. We'll cover strings more thoroughly in a later lecture.

In [50]:
len(message1) # number of characters

19

In [51]:
list(reversed(message2))

['e', 'n', 'i', 'f']

In [52]:
message2[::-1]

'enif'

# Variables and Assignment

An assignment statement assigns a value to a variable name. It is done with a single equal sign, `=`.

The name **must** be on the left-hand side of the equal sign.

The value being assigned must be on the right-hand side of the equal sign.

When an assignment operation takes place, Python will not output anything to the screen.

In [53]:
n = 5

In [54]:
print(n)

5
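
To see why the name must be on the left-hand side, the sketch below asks Python to compile the reversed statement `5 = n` from a string. Using `compile()` here is only a convenience so that the error can be caught; the label `"<example>"` is an arbitrary placeholder.

```python
n = 5   # valid: name on the left, value on the right

# The reversed form is rejected before any code runs.
try:
    compile("5 = n", "<example>", "exec")
    reversed_form_is_valid = True
except SyntaxError:
    reversed_form_is_valid = False

print(n, reversed_form_is_valid)
```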


# Python Variables are Pointers

Contrast Python to other languages like C or Java. In those languages, when you define a variable, you define a container or 'bucket' that stores a certain kind of data.

```
// C code
int x = 4;
```

The above line defines a 'bucket' in memory intended for integers called `x` and we are placing the value 4 in it.



## Python Variables are Pointers (cont'd)

In Python, when we write:

In [55]:
x = 4

We are defining a *pointer* called `x` that points to a bucket that contains the value 4. With Python, there is no need to "declare" variables. 

In Python, we are allowed to have the variable point to a new object of a completely different type. Python is *dynamically-typed*.

We can do the following with no problems:

In [56]:
x = 1         # x points to an integer
x = "hello"   # x points to a string
x = [1, 2, 3] # x points to a list
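
One consequence of the pointer picture is that two names can point to the same object. The following sketch uses an additional name `y`, which is introduced here only for illustration.

```python
x = [1, 2, 3]
y = x                 # y points to the same list as x, not to a copy

x.append(4)           # modify the list through x
same_object = (x is y)
y_after_append = list(y)   # snapshot of what y sees: [1, 2, 3, 4]

x = "hello"           # re-pointing x does not affect y
y_after_rebinding = y

print(same_object, y_after_append, x, y_after_rebinding)
```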

# Variable Names

You can choose almost anything to be a variable name.

A few rules: 

- names can have letters, numbers, and underscore characters `_`
- must not start with a number
- no symbols other than underscore
- no spaces
- cannot be a Python keyword

## Reserved Python Keywords

~~~
False      await      else       import     pass
None       break      except     in         raise
True       class      finally    is         return
and        continue   for        lambda     try
as         def        from       nonlocal   while
assert     del        global     not        with
async      elif       if         or         yield
~~~
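
The naming rules and the keyword list can be checked programmatically with `str.isidentifier()` and the standard `keyword` module. This is a minimal sketch; the helper `is_valid_name` and the candidate names are hypothetical and exist only for illustration.

```python
import keyword

def is_valid_name(name):
    """Return True if name follows the identifier rules and is not a keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)

candidates = ["station_id", "_private", "2nd_place", "my-var", "total rows", "class"]
results = {name: is_valid_name(name) for name in candidates}

for name, is_valid in results.items():
    print(f"{name!r}: {is_valid}")
```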

## The Art of Naming Variables

As you program, do your best to think of good variable names. This is surprisingly hard to do.

The goal is being able to read your program and understand what the variable is without having to go back to the assignment statement to remember.

Some principles (taken from: https://geo-python.github.io/site/notebooks/L1/gcp-1-variable-naming.html). A good variable name should:

- be clear and concise.
- be written in English.
- not contain special characters. It is possible to use `lämpötila` as a variable name, but it is better to stick to ASCII (US keyboard) characters.

## Examples of variable names that are not good

In [57]:
s = "101533"

In [58]:
sid = "101533"

The above names have the problem that we have no idea what they represent.

In [59]:
finnishmeteorologicalinstituteobservationstationidentificationnumber = "101533"

This name has the problem that it is too long and difficult to read.

## Examples of variable names that are better
Naming conventions: 

- `snake_case` or `pothole_case` uses underscores between words
- `lowerCamelCase` or `UpperCamelCase` uses capital letters to signify new words. Lower camel case starts with a lowercase letter, and upper camel case starts with an uppercase letter.

In [60]:
fmi_station_id = "101533"

In [61]:
fmiStationID = "101533"

## Other Naming considerations:

Taken from: https://hackernoon.com/the-art-of-naming-variables-52f44de00aad

- It is helpful if the name of a list or array is **plural**.
- If the variable contains string values, including `Names` as part of the variable name can be helpful.

In [62]:
# not great
fruit = ['apple', 'banana', 'orange']

In [63]:
# good
fruits = ['apple', 'banana', 'orange']

In [64]:
# even better as Names implies the usage of strings
fruitNames = ['apple', 'banana', 'orange']

## Boolean values

Names for variables containing boolean values work best when they take the form of a question that can be answered with yes or no.

In [65]:
# not great
selected = True
write = True
fruit = True

In [66]:
# good
isSelected = True
canWrite = True
hasFruit = True

## Numeric values

If it makes sense, adding a descriptive word to the name of a numeric variable can be useful.

In [67]:
# not great
rows = 3

In [68]:
# better
minRows = 1
maxRows = 50
totalRows = 3
currentRow = 7

## Function/Method Names

- Functions/methods that modify an object should be named with an **action verb**.
- Functions/methods that do not modify an object but return a modified version of it should be named with the **past participle of a verb**.

For example, a function/method that takes a list and modifies it by sorting it should be called `sort()`.

On the other hand, a function/method that takes the list and does not modify the list itself, but simply returns a sorted version of it, can be called `sorted()`.

## Learning by studying

- Python itself follows many of these best practices for naming functions.
- You can learn by simply paying attention to how things are written in Python.

In [69]:
carBrandNames = ['Ford', 'BMW', 'Volvo', 'Toyota']
carBrandNames.sort() # sort method sorts and modifies the list itself
carBrandNames

['BMW', 'Ford', 'Toyota', 'Volvo']

### Learning by studying (cont'd)

In [70]:
carBrandNames = ['Chevrolet', 'Audi', 'Honda']
sorted(carBrandNames) # sorted function returns the sorted list, but does not modify the list

['Audi', 'Chevrolet', 'Honda']

In [71]:
carBrandNames # we see the list is unmodified

['Chevrolet', 'Audi', 'Honda']

### Learning by studying (cont'd)

In [72]:
carBrandNames

['Chevrolet', 'Audi', 'Honda']

In [73]:
carBrandNames.sorted() # this attribute does not exist

AttributeError: 'list' object has no attribute 'sorted'

### Learning by studying (cont'd)

In [74]:
sort(carBrandNames) # this function does not exist

NameError: name 'sort' is not defined
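
A practical consequence of this naming distinction is that `list.sort()` returns `None`, so assigning its result to a variable is a common mistake. The sketch below uses a made-up list named `brands`.

```python
brands = ['Ford', 'BMW', 'Volvo']

# sorted() returns a new list and leaves the original untouched.
sorted_copy = sorted(brands)
original_after_sorted = list(brands)

# .sort() modifies the list in place and returns None.
sort_return_value = brands.sort()

print(sorted_copy, original_after_sorted, sort_return_value, brands)
```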

## Some functions have no arguments

In [75]:
dir()

['In',
 'Out',
 '_',
 '_1',
 '_11',
 '_12',
 '_14',
 '_15',
 '_17',
 '_18',
 '_2',
 '_20',
 '_21',
 '_22',
 '_23',
 '_27',
 '_3',
 '_30',
 '_31',
 '_32',
 '_33',
 '_34',
 '_36',
 '_4',
 '_42',
 '_43',
 '_44',
 '_45',
 '_46',
 '_47',
 '_5',
 '_50',
 '_51',
 '_52',
 '_6',
 '_69',
 '_7',
 '_70',
 '_71',
 '_72',
 '__',
 '___',
 '__builtin__',
 '__builtins__',
 '__doc__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 '_dh',
 '_i',
 '_i1',
 '_i10',
 '_i11',
 '_i12',
 '_i13',
 '_i14',
 '_i15',
 '_i16',
 '_i17',
 '_i18',
 '_i19',
 '_i2',
 '_i20',
 '_i21',
 '_i22',
 '_i23',
 '_i24',
 '_i25',
 '_i26',
 '_i27',
 '_i28',
 '_i29',
 '_i3',
 '_i30',
 '_i31',
 '_i32',
 '_i33',
 '_i34',
 '_i35',
 '_i36',
 '_i37',
 '_i38',
 '_i39',
 '_i4',
 '_i40',
 '_i41',
 '_i42',
 '_i43',
 '_i44',
 '_i45',
 '_i46',
 '_i47',
 '_i48',
 '_i49',
 '_i5',
 '_i50',
 '_i51',
 '_i52',
 '_i53',
 '_i54',
 '_i55',
 '_i56',
 '_i57',
 '_i58',
 '_i59',
 '_i6',
 '_i60',
 '_i61',
 '_i62',
 '_i63',
 '_i64',
 '_i65',
 '_i66',
 ...

# Function and Method

- Python has both functions and methods (some other languages have only one or the other).
- Some people use the terms interchangeably, but there are some differences.

Functions:
- are not defined inside a class
- may not need an argument (e.g. `dir()`)
- the syntax is `<functionname>(<additional parameter values>)`

Methods:
- are functions that belong to an object
- always take at least one argument: the object the method is called on, which Python passes in automatically
- the syntax is `<expr>.<methodname>(<additional parameter values>)`
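
The difference in syntax can be seen by calling the same string method in two ways. This is a small sketch; the string `"fine"` is reused from earlier in the lecture and the variable names are illustrative.

```python
message = "fine"

# Function call: the object is passed explicitly as an argument.
length = len(message)

# Method call: the object goes before the dot and is passed in automatically.
via_method = message.upper()

# The same method reached through the class, with the object passed explicitly.
via_class = str.upper(message)

print(length, via_method, via_class)
```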