# Week 01: Data Types and Operations

In computer programming, every value you work with is an object, and every object has a type. For example, `1` is an integer, while `1.0` is a float (a number with a decimal point). Python has a few common data types: 
* `boolean`
* `int`
* `float`
* `str`
* `list`
* `dict`

A data type tells Python what kind of object a value is and which operations it supports. You will use these types constantly when programming in Python, so it is important to be familiar with them. 

### Boolean

Boolean values are either `True` or `False`. For example, if you ask whether `3 > 2`, the computer returns `True`; if you ask whether `3 < 2`, it returns `False`. To see this in practice, run the following: 

In [None]:
# Try this
print(3 > 2)

In Python, the boolean values are keywords and must be written with a capital first letter. The lower-case `true` and `false` are treated as variable names and raise an error if they have not been defined. The cell below shows what happens when the wrong format is used. 

In [1]:
type(false)

NameError: name 'false' is not defined

### Integer and Float

Integers are whole numbers: positive, negative or zero. For example, `1` is an integer. In Python the integer data type is a class called `int`. One use of the class name is to check whether an object is an integer, for example, 
```python
isinstance(10,int)
```
which returns `True`. 

To represent decimal numbers, Python has the float data type. A float is a different data type from an integer. For example, `1` has type `int`, but `1.0` has type `float`. You can test this below: 

In [None]:
# Try this
print(type(1))
print(type(1.0))

Before we move on, one warning for beginners: computing with floats can give slightly inaccurate results. The computer stores floats in binary, and many decimal fractions (such as 0.1) have no exact binary representation, so a small rounding error can appear when a decimal is converted to binary and back. As a caveat, avoid relying on floats where exact results matter, and do not compare floats with `==`. 

In [None]:
# Try this
print(0.01 + 0.02)
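
As a hedged illustration of the warning above, the sketch below uses `0.1 + 0.2`, a well-known case where binary rounding becomes visible, and compares floats with `math.isclose` from the standard library instead of `==`. The variable name `total` is chosen for this example only.

```python
import math

total = 0.1 + 0.2

print(total)                     # 0.30000000000000004
print(total == 0.3)              # False: the two floats differ slightly
print(math.isclose(total, 0.3))  # True: equal within a small tolerance
```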

### Strings

In computer science, a string means text. The name comes from the phrase 'a string of characters'. To write a string, wrap the text in quotation marks. For example, 
```python
'Python is an easy language to learn.'
```
You can check that the above is a string using the `type()` function. 

In [None]:
# Try this
type('Python is an easy language to learn.')

Be careful with apostrophes inside strings: the interpreter will think the string has ended at the apostrophe, which causes an error unless the apostrophe is escaped (explained below). Here is what happens when the apostrophe is not escaped: 

```python
'How's your day?'
```

As you can see from the highlighting above, part of the text is not coloured red as a string. The interpreter reads the string `'How'` and then cannot make sense of what follows, so it raises a `SyntaxError`. To fix the snippet, we escape the apostrophe by putting a backslash (`\`) in front of it. Using the above example, it is 
```python
'How\'s your day?'
```
Alternatively, you can use double quotation marks to present a string, such as
```python
"How's your day?"
```
That way there is no need to escape the apostrophe. 
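
A small sketch to confirm that both ways of writing the apostrophe produce the same string. The variable names `escaped` and `double_quoted` are made up for this illustration.

```python
escaped = 'How\'s your day?'        # apostrophe escaped with a backslash
double_quoted = "How's your day?"   # double quotes, so no escape is needed

print(escaped)                      # How's your day?
print(escaped == double_quoted)     # True: the two strings are identical
print(type(escaped))                # <class 'str'>
```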

### List

When we want to group several pieces of data in one container, we use a list. For example, 
```python
['Some text', 'Some more texts']
```
This list contains 2 strings. Each element in a list has a value and an index. The index indicates the position of the element inside the list. To access the elements, let us first store a list in a variable. 

In [None]:
ls = [1, 2, 5, 8, 9]

Note that `ls` is often used as a variable name for a list, while the word `list` itself is the name of the built-in list type in Python. Do not use `list` as a variable name, because it would hide the built-in type for the rest of your program. 

In [None]:
# Try this
type(ls)

Now that we have a list defined, we can access its elements using square brackets. Before trying this on `ls`, remember that list indices in Python start at 0, not 1. This is common in programming languages, but it can feel counterintuitive at first. 

So let us do some practice first. 
```python
ls[0]
```
returns the first element in `ls`. 

In [None]:
# Try this
ls[0]

What if we ask for an index that is beyond the end of the list? Python raises an `IndexError`, so when using loops, take care that your indices stay within the length of the list. 
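
The sketch below illustrates zero-based indexing and the out-of-range error on the same list `ls`. The `try`/`except` block and the name `message` are additions for this example, so that the error is printed rather than stopping the cell.

```python
ls = [1, 2, 5, 8, 9]

print(len(ls))           # 5 elements, so valid indices run from 0 to 4
print(ls[0])             # first element: 1
print(ls[len(ls) - 1])   # last element: 9

message = None
try:
    ls[5]                # index 5 is one past the end of the list
except IndexError as err:
    message = str(err)

print('IndexError:', message)
```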

We will look further into lists in week 3.

### Dictionary

A dictionary stores data as key-value pairs. Its format is 
```python
{key: value}
```
This is useful for recording the attributes of an object: each key names an attribute and each value holds its content. For example, a tweet object has the user id of the sender. It can be imagined as 
```python
{'user': 'SomeUser'}
```
Note that both the key and the value are written as strings here. Values can be of any data type, such as integers, floats or even a list.  

Let us create a dictionary. It is as simple as following the format above. Suppose we have a dictionary called `user`, which describes a person called Sally. Here are her details: 
* She is 1.72 meters tall. 
* She has red hair. 
* She is an engineer. 
* Her Facebook name is `'SallyDunn99'`.
* She likes spending time with friends, yoga and using Instagram. 

How would you create a dictionary object for Sally? You may add further assumptions to your answer. 

In [None]:
# Try yourself. 
user = {'height': 1.72, 'hair': 'red'}

To access a value in the dictionary, we use the following format: 
```python
some_dictionary['key-of-interest']
```
This returns the value stored under `'key-of-interest'`. How would you look up Sally's height? Try it below. 

In [None]:
# Try yourself. 
print(user['height'])

If we want to add a new attribute for Sally, for example her nationality, we can simply write
```python
user['nationality'] = 'some-country'
```

In [None]:
# Try yourself. 
user['nationality'] = 'some-country'

print(user)

If we want to remove the attribute, we can use the `del` keyword. For example, 
```python
del user['nationality']
```
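
One possible answer to the Sally exercise is sketched below, followed by the add and delete steps described above. The key names `'job'`, `'facebook'` and `'hobbies'` are assumptions made for this illustration; any descriptive key names would do.

```python
user = {
    'height': 1.72,                                # in metres
    'hair': 'red',
    'job': 'engineer',
    'facebook': 'SallyDunn99',
    'hobbies': ['friends', 'yoga', 'Instagram'],   # a list used as a value
}

print(user['height'])                  # 1.72

user['nationality'] = 'some-country'   # add a new key
print('nationality' in user)           # True

del user['nationality']                # remove the key again
print('nationality' in user)           # False
```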

### Numpy Data Types

NumPy is a popular Python package that is very useful for computing with large datasets. The main reason it is fast is that its core is written in C. C is a common low-level language that is compiled directly into machine instructions, while Python code must be interpreted before the machine can follow it. 

Data types in C are closely tied to how your computer stores them. You might have come across how binary numbers convert into decimal numbers. For example, `10` in binary means 2 in decimal. Computers store information in __bits__. A bit is the smallest unit of storage and holds either 0 or 1, for example whether an electric charge is present or not. How this is done physically depends on how your computer is made. So let us consider a number stored in 8 bits. 

In fact, the 8-bit constraint means only $2^8\;=\;256$ different values can be represented, so the largest unsigned number is $255$. If we also need negative numbers, the range is $-128$ to $127$, so the largest number we can compute with is $127$. This is too small on many occasions. So NumPy (like C) has several integer data types that use more or less memory. For instance, 

* `np.int8` means the number ranges from -128 to 127. 
* `np.int16` means the number ranges from -32768 to 32767. 
* `np.int32` means the number ranges from -2147483648 to 2147483647. 
* `np.int64` means the number ranges from -9223372036854775808 to 9223372036854775807. 

There are other numerical data types such as unsigned integers (e.g. `np.uint8`), floats and `long` (its size depends on the computer, but it is at least 32 bits). 
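
The ranges listed above follow from the number of bits: one bit is used for the sign, leaving $2^{n-1}$ negative values and $2^{n-1}$ non-negative values. The sketch below recomputes them in plain Python, without NumPy. The helper `signed_range` is a hypothetical name introduced for this example.

```python
def signed_range(bits):
    """Return (lowest, highest) for a signed integer stored in `bits` bits."""
    lowest = -(2 ** (bits - 1))
    highest = 2 ** (bits - 1) - 1
    return lowest, highest

for bits in (8, 16, 32, 64):
    lowest, highest = signed_range(bits)
    print(f'int{bits}: {lowest} to {highest}')

# An unsigned 8-bit integer has no sign bit, so all 256 values are non-negative.
print(f'uint8: 0 to {2 ** 8 - 1}')
```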

## Drill

For each of the following statements, how can you tell whether the stated type is correct? 
* 1 has a type of int?
* 1.00 has a type of float?
* abc has a type of string?
* 'I am fine.' has a type of string?
* {1, 2, 3} has a type of list or array?
* {"1, 2, 3"} has a type of list or array?

You can test these using the Python functions `isinstance(obj, type)` or `type(obj)`. Try them in the next cell. 

In [3]:
# The first item is done for you. 
print(isinstance(1, int))
type(1)
# Test the types in below

True


int

In the following, explain what each of these operators does.

```python
# Do not compute this cell. 
+ - * / // %
```

## Python

Python is a popular language for data analytics. Its syntax is easy to write, and it has a large community with many useful packages for data analytics and scripting. In the following, let us understand the basic syntax of Python. 

```python
# This is a comment
from numpy import *  # numpy is a package with many useful functions, so we do not have to write them from scratch.
import sys  # Import the whole library.

cmd = input('What word means happy to you?')  # A name followed by brackets is a function call.
intensity = input('How intense is it?')
intensity = int(intensity)  # input() returns a string, so convert it to an integer.
print(intensity)  # Important function! Prints variables or strings on the screen.

if intensity > 5:   # If the condition is true, the indented line below is run.
    print('Intensity has to be less than or equal to 5.')
else:   # If the condition is false, this branch runs instead.
    print('{} means happy to the degree of {}'.format(cmd, intensity))
```

As you can see, the above code starts by importing packages (or libraries). Then variables are defined before the main computation starts. This is the basic structure of a Python program. 

The format of Python code is its syntax. In the following, let us explore a few cases where code can confuse the interpreter. 

In [None]:
# Run this.
a + 1
a = 1

What is happening?

In [None]:
# Run this.
int("Social science")

What is happening?

In [None]:
# Run this.
a = 1
b = 2
a = b

What are `a` and `b` after the code has been run?

In [None]:
# Run this. 
a = 3
t = 1.4
a = a + t/2

What are the values of `a` and `t` after they run?
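
To check your answers to the first three cells, the sketch below reproduces them with `try`/`except`, so that the errors are printed instead of stopping the cell. The name `not_yet_defined` and the list `errors` are made up for this illustration.

```python
errors = []

try:
    not_yet_defined + 1       # the name is used before any assignment
except NameError as err:
    errors.append(type(err).__name__)
    print('NameError:', err)

try:
    int("Social science")     # the text does not represent a whole number
except ValueError as err:
    errors.append(type(err).__name__)
    print('ValueError:', err)

a = 1
b = 2
a = b                         # copies the value of b into a; b is unchanged
print(a, b)                   # 2 2
```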

### Variables

As you can see above, a variable is a container for data. For example, instead of writing `20` each time to represent someone's age, we could say 
```python
age = 20
```
This is convenient when we need to use the value in a computation. It will be discussed in the next section on operators. 

In Python, there are rules and conventions for naming variables. Some are __technical specifications__ that the language enforces; others make your code easier for other people to read (i.e. readability). The following are the rules for naming variables in Python: 
* Technical specifications
    * __Never start your variable with a number__
    * You can use letters, digits and underscores in variable names
    * The variable name must not be a keyword in Python (e.g. `class` or `for`)
* Conventions
    * __Snake case__ Words are separated by underscores (e.g. `snake_case`)
    * __Camel case__ Second and subsequent words are capitalised (e.g. `camelCase`)
    * __Pascal case__ All words start with a capital (e.g. `PascalCase`)


For example, these are correct 
```python
Age
FirstClass
```
but the following are not
```python
class
1stClass
```

__Exercise:__ Can you tell why the latter variables are incorrect?

__Solution:__
* `class` is reserved for classes in Python
* `1stClass` starts with a number
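
The two technical rules can also be checked programmatically. The sketch below uses `str.isidentifier()` and `keyword.iskeyword()` from the standard library; the helper `is_valid_name` is a hypothetical name introduced for this example.

```python
import keyword

def is_valid_name(name):
    """True if `name` is a legal identifier and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)

for name in ['Age', 'FirstClass', 'snake_case', 'class', '1stClass']:
    print(f'{name!r}: {is_valid_name(name)}')
```

Only `'class'` (a keyword) and `'1stClass'` (starts with a digit) are rejected.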

## Python Operators

Operators are symbols that compute a result from one or more values. There are different types of operators. A mathematical operator computes a number from its operands. For example, the negation operator `-` gives the negative of a number. A logical operator combines one or more true/false values into a single `True` or `False`. 

So let us look at some examples. 

### Mathematical Operators

In Python, addition, subtraction, multiplication and division are easy to compute. Type your expression inside the `print()` function. 

__Exercise:__ Compute 233 + 238

In [None]:
# Your code here
print(???)

In [None]:
# Solution
print(233+238)

__Exercise:__ Compute a variable `a` which is 2 plus three times `x`. `x` is predefined in this exercise as `x = 26`. 

In [None]:
# Your code here


In [None]:
# Solution
x = 26
a = 2 + 3*x
a

You can also compute powers (exponents) in Python using `**`. 

In [None]:
print(3*3 == 3**2)
# If returns True, then the power operator works. 

__Exercise:__ How could you test if the power symbol is `**`? Use a test case to help you. You may find `==` would be useful for comparing if both sides are equal. 

In [None]:
# Your code here

In [None]:
# Solution (For example)
2 ** 3 == 2 * 2 * 2

In Python, there are 2 different division operators: `/`, the (normal) __division__, and `//`, the __floor division__. The latter rounds the result down to the nearest whole number. Try it below: 

In [None]:
print(8/3)
print(8//3)

As you can see, `8/3` returns the decimal 2.66..., while `8//3` returns 2. 

__Exercise:__ What would the data types be after the division?

In [1]:
# Solution
print(type(5/3))
print(type(5//3))

<class 'float'>
<class 'int'>


We can also find the remainder of a division using the modulus operator `%`. For example, 
```python
8 % 3
```
which results in $2$. This means that after taking out the largest multiple of $3$ that fits into $8$ (which is $6$), $2$ remains. You can see it in the example below: $2$ groups of 3 biscuits each are chosen, so $2$ biscuits are left. 

<img src="fig/Remainder.png" width="620"/>

What about a negative divisor? In Python, the result of `%` takes the sign of the divisor, as the example below shows: 

In [2]:
# Try this
print(8 % -3)

-1
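
The result above can be understood through the relationship between floor division and the remainder: for any `a` and non-zero `b`, Python guarantees `a == (a // b) * b + a % b`. The sketch below checks this for a few sign combinations; the list `cases` and the names `q` and `r` are chosen for this example.

```python
cases = [(8, 3), (8, -3), (-8, 3)]

for a, b in cases:
    q = a // b    # floor division rounds down, towards negative infinity
    r = a % b     # the remainder takes the sign of the divisor b
    print(f'{a} // {b} = {q}, {a} % {b} = {r}, check: {q * b + r == a}')
```

For `8 % -3`, floor division gives `-3` (rounding `-2.66...` down), and `8 - (-3) * (-3)` leaves `-1`.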


What if we would like to use multiple mathematical operators? There is a fixed rule for which operator is computed first in a line, called operator precedence. The order is (highest priority to lowest): 
* brackets `()`
* exponential `**`
* unary plus, unary minus and bitwise NOT `+X`, `-X`, `~X`
* multiplication `*`, divisions `/`, `//`, modulus `%`
* plus and minus `+`, `-`
* comparison operators `==`, `!=`, `<`, `>`, `<=`, `>=`, `is`, `is not`, `in`, `not in`
* logical `not`
* logical `and`
* logical `or`

This means the highest priority, such as brackets, is computed first. Operators with the same priority, such as `*` and `//`, are computed from left to right. 

__Exercise:__ What is the result of `(2 * 3) + 5 * 7//2`?

In [5]:
(2 * 3) + 5 * 7//2

23

In [None]:
# Try it yourself
2 * 3
7//2
5* 7//2
(2 * 3) + 5
(2 * 3) + 5 * 7//2
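
A short sketch of why the answer is 23: `*` and `//` have the same priority and are evaluated from left to right, so `5 * 7 // 2` means `(5 * 7) // 2`, not `5 * (7 // 2)`. The variable names below are chosen for this example only.

```python
left_to_right = 5 * 7 // 2       # (5 * 7) // 2 = 35 // 2 = 17
forced_order = 5 * (7 // 2)      # 5 * 3 = 15, a different result
full = (2 * 3) + 5 * 7 // 2      # 6 + 17 = 23
with_power = 2 + 3 * 2 ** 2      # ** first, then *, then +: 2 + 3 * 4 = 14

print(left_to_right, forced_order, full, with_power)
```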

### Logical Operators

To understand logical operators, let us refer to the following quote: 

_To be, or not to be_

This quote from Shakespeare depicts the dilemma of Hamlet. It also hints at how confusing it is when a statement seems to be true and false at the same time. This is where logic comes in. In daily life we make statements that are each either true or false. When several statements are joined together, the truth of the whole depends on how they are joined. The words that join them are logical operators. 

The common logical operators are: AND, OR, NOT, XOR. We will explain them below. 

The `AND` operator returns `True` only if both inputs are true. 

<table>
    <thead>
    <tr>
        <td colspan="3" style="text-align:center"><b>AND</b></td>
    </tr>
    </thead>
    <tr>
        <td>A</td>
        <td>B</td>
        <td>A AND B</td>
    </tr>
    <tr>
        <td>1</td>
        <td>1</td>
        <td>1</td>
    </tr>
    <tr>
        <td>1</td>
        <td>0</td>
        <td>0</td>
    </tr>
    <tr>
        <td>0</td>
        <td>1</td>
        <td>0</td>
    </tr>
    <tr>
        <td>0</td>
        <td>0</td>
        <td>0</td>
    </tr>
</table>

For example, suppose you want to see whether 2 people have both agreed to do something. Let us define 
```python
Stacey = True
Marilyn = True
```
We can use `and` to combine their opinions. So we write 
```python
print(Stacey and Marilyn)
```

The `OR` operator returns true if any of the inputs are true. It can be summarised by the truth table below: 

<table>
    <thead>
    <tr>
        <td colspan="3" style="text-align:center"><b>OR</b></td>
    </tr>
    </thead>
    <tr>
        <td>A</td>
        <td>B</td>
        <td>A OR B</td>
    </tr>
    <tr>
        <td>1</td>
        <td>1</td>
        <td>1</td>
    </tr>
    <tr>
        <td>1</td>
        <td>0</td>
        <td>1</td>
    </tr>
    <tr>
        <td>0</td>
        <td>1</td>
        <td>1</td>
    </tr>
    <tr>
        <td>0</td>
        <td>0</td>
        <td>0</td>
    </tr>
</table>

The `OR` operator differs from everyday language in one respect. Often when we say 'or' in daily language, we mean one but not the other. This is called the exclusive OR in logic, and we use `XOR` to represent it. 

<table>
    <thead>
    <tr>
        <td colspan="3" style="text-align:center"><b>XOR</b></td>
    </tr>
    </thead>
    <tr>
        <td>A</td>
        <td>B</td>
        <td>A XOR B</td>
    </tr>
    <tr>
        <td>1</td>
        <td>1</td>
        <td>0</td>
    </tr>
    <tr>
        <td>1</td>
        <td>0</td>
        <td>1</td>
    </tr>
    <tr>
        <td>0</td>
        <td>1</td>
        <td>1</td>
    </tr>
    <tr>
        <td>0</td>
        <td>0</td>
        <td>0</td>
    </tr>
</table>

Unlike `AND` and `OR`, it is rarely used in applications. 

Finally, let us look at the `NOT` operator. It simply returns the opposite of its input. Here is the truth table for the `NOT` operator. 

<table>
    <thead>
    <tr>
        <td colspan="2" style="text-align:center"><b>NOT</b></td>
    </tr>
    </thead>
    <tr>
        <td>A</td>
        <td>NOT A</td>
    </tr>
    <tr>
        <td>1</td>
        <td>0</td>
    </tr>
    <tr>
        <td>0</td>
        <td>1</td>
    </tr>
</table>

In Python, these logical operators are written in lower case: `and`, `or` and `not`. Python has no `xor` keyword; for two boolean values, `!=` or `^` gives the same result as XOR. For example, `3 > 2` in Python returns `True`. So if we are looking for the opposite of this, we type `not (3 > 2)`. 
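
The truth tables above can be reproduced in Python. The sketch below loops over every combination of two boolean inputs and, as an assumption for this example, uses `^` on `bool` values to stand in for XOR. The list `rows` is a name chosen for this illustration.

```python
rows = []
for a in (True, False):
    for b in (True, False):
        # Columns: A, B, A AND B, A OR B, A XOR B
        rows.append((a, b, a and b, a or b, a ^ b))

print('A      B      AND    OR     XOR')
for row in rows:
    print('  '.join(f'{value!s:5}' for value in row))

print(not True, not False)   # False True
```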

## Print

Finally, let us put data types and operations to use. The most prominent application is printing to the console. In programming, we often work in a command line (sometimes you will see this as a black-and-white screen with some geeky fonts on it). To show what we have done, we need to print our work to the console. If you are not sure where the data is printed, try the following code: 

In [None]:
# Try me
print('Hello world!')

You can see that "Hello world!" is printed on the screen. This is a very useful feature in programming. For example, if you want to see the results after some code has run, or you would like to debug a large piece of code, you will use this `print()` function. 

We can print variables using `print()`. This can be done in any of 3 ways: 
* `C` syntax
* `format()` method 
* f-string

These 3 methods use different syntax to print objects. To compare them, let us use an example. We would like to print the following sentence: "I travel to school by `<transport>`.", where `<transport>` is substituted by the value of the variable `transport`. 

__`C` syntax__

Python borrowed this way of printing strings from the `printf` function of the `C` language. To print the transport, you need to know what data type you are trying to print. For example: 
* `%s` means strings
* `%d` means integers
* `%f` means float

After the string itself, add a `%` followed by the variable name. 
```python
print("I travel to school by %s. " % transport)
```

If there are multiple variables, put them all inside brackets after the `%`, in the order they should appear. For example, 
```python
transport1 = 'bus'
transport2 = 'foot'
print("I travel to school by %s and %s. " % (transport1,transport2))
```

__`.format()`__

The `.format()` method takes the variables as its arguments. To use it, append the method to the string you would like to print, with `{}` marking where each value goes. 
```python
print("I travel to school by {}. ".format(transport))
```

As a side note, you can also pass a literal string as the argument of `.format()`. This means 
```python
print("I travel to school by {}. ".format('bus'))
```
is the same as the example. If there are multiple variables, pass them all as arguments, in the order they should appear. For example, 
```python
transport1 = 'bus'
transport2 = 'foot'
print("I travel to school by {} and {}. ".format(transport1,transport2))
```

__f-string__

The f-string is the most recent way to print in Python (available since Python 3.6). To use it, add `f` in front of the string and put the variable name inside curly brackets where the value should appear. 
```python
print(f"I travel to school by {transport}. ")
```

You can try this example below: 

In [4]:
# Try me

transport = "bus"

print("I travel to school by %s. " % transport)
print("I travel to school by {}. ".format(transport))
print(f"I travel to school by {transport}. ")

I travel to school by bus. 
I travel to school by bus. 
I travel to school by bus. 


You can also print computed values using any of the methods above. 

__Exercise:__ Print the following sentence: "22 + 33 * 3 = {}" where the curly bracket is the result from the computation. Use either `.format()` or f-string. 

In [6]:
# Your code


In [5]:
# Solution

## % method
print("22 + 33 * 3 = %d" % (22 + 33 * 3))

## Format
print("22 + 33 * 3 = {}".format(22 + 33 * 3))

## f-string
print(f"22 + 33 * 3 = {22 + 33 * 3}")

22 + 33 * 3 = 121
22 + 33 * 3 = 121
22 + 33 * 3 = 121


### Formatting Types

When displaying numbers, we can format them to make the output look neat. To format a variable, start the format specification inside the curly brackets with a colon. For example, you can display __2 decimal places__ with `{:.2f}` where 
* `.` marks the precision (the number of decimal places). 
* `2` for 2 decimal places. 
* `f` for float.

| Format | Explanation                                        | Example                   | Output         |
|--------|----------------------------------------------------|---------------------------|----------------|
| `:d`   | Integer                                            | `'{:d}'.format(20)`       | `'20'`         |
| `:3d`  | Integer padded with spaces to a width of 3         | `'{:3d}'.format(20)`      | `' 20'`        |
| `:03d` | Integer padded with leading zeros to a width of 3  | `'{:03d}'.format(20)`     | `'020'`        |
| `:f`   | Float (6 decimal places by default)                | `'{:f}'.format(3.1415)`   | `'3.141500'`   |
| `:.2f` | Float with 2 decimal places                        | `'{:.2f}'.format(3.1415)` | `'3.14'`       |
| `:%`   | Percentage (6 decimal places by default)           | `'{:%}'.format(0.20)`     | `'20.000000%'` |
| `:.0%` | Percentage with no decimal places                  | `'{:.0%}'.format(0.20)`   | `'20%'`        |
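
The format specifications in the table can be tried out in one go. The sketch below stores each result in a dictionary so that the exact padding is visible through `repr()`; the names `examples` and `results` are chosen for this illustration.

```python
examples = [
    ('{:d}', 20),
    ('{:3d}', 20),
    ('{:03d}', 20),
    ('{:f}', 3.1415),
    ('{:.2f}', 3.1415),
    ('{:%}', 0.20),
    ('{:.0%}', 0.20),
]

# Map each format string to the text it produces.
results = {spec: spec.format(value) for spec, value in examples}

for spec, text in results.items():
    print(spec, '->', repr(text))
```

Note that `:f` and `:%` show 6 decimal places unless a precision such as `.2` or `.0` is given.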

__Exercise:__ Display the following sentence with the suitable format: "From the survey of {} respondants, {} people specified they have observed stray cats on the streets. Most respondants have concerned their welfare, {} respondants have seen the cats looked '{}, {} and {}'... This is {} more than last year."

Specification (in order of the brackets' appearance):
* The number of surveyed respondents. The value is stored in `respondants`.
* The number of people who have seen stray cats on the streets. The value is stored in `seen`.
* The percentage of respondents who are concerned, which is $35\%$. The value is stored in `concerned`. Format this variable as a percentage with no decimal places. 
* The opinions are stored in the list `desc`, one bracket per element.
* The last value is stored in `growth` and should be shown with 2 decimal places. Use percentage format. 

In [28]:
# Solution
respondants = 3600
seen = 1200
concerned = 0.35
desc = ['weak','endangered','futile']
growth = 0.328445

## Format
print("From the survey of {} respondants, {} people specified they have observed stray cats \
      on the streets. Most respondants have concerned their welfare, {:.0%} \
      respondants have seen the cats looked '{}', '{}' and '{}'...\
      This is {:.2f} more than last year.".format(respondants, seen, concerned, desc[0], desc[1], desc[2], growth))

print()
## f-string
print(f"From the survey of {respondants} respondants, {seen} people specified they have observed stray cats \
      on the streets. Most respondants have concerned their welfare, {concerned:.0%} \
      respondants have seen the cats looked '{desc[0]}, {desc[1]} and {desc[2]}'... This is {growth:.2f} more than last year.")

From the survey of 3600 respondants, 1200 people specified they have observed stray cats       on the streets. Most respondants have concerned their welfare, 35%       respondants have seen the cats looked 'weak', 'endangered' and 'futile'...      This is 0.33 more than last year.

From the survey of 3600 respondants, 1200 people specified they have observed stray cats       on the streets. Most respondants have concerned their welfare, 35%       respondants have seen the cats looked 'weak, endangered and futile'... This is 0.33 more than last year.


In [None]:
# Do not change the values 
respondants = 3600
seen = 1200
concerned = 0.35
desc = ['weak','endangered','futile']
growth = 0.328445

# Format the string
print("From the survey of {} respondants, {} people specified they have observed stray cats \
      on the streets. Most respondants have concerned their welfare, {} \
      respondants have seen the cats looked  '{}', '{}' and '{}'... This is {} more than last year.")

## Conclusion

Today we have looked at: 
* Data types in Python
* The basic syntax of Python
* How to use Jupyter and its advantages
* Mathematical and logical operators in Python
* Display standard output

## Further Reading

* [String format from W3Schools](https://www.w3schools.com/python/ref_string_format.asp)