# Programming and Data Analysis

> Data Types in Python

Kuo, Yao-Jen <yaojenkuo@ntu.edu.tw> from [DATAINPOINT](https://www.datainpoint.com/)

## Variables

## Printing literal values alone is not very useful

In [1]:
print("Hello, world!")

Hello, world!


## It is more useful to refer to a literal value by an object name

In [2]:
hello_world = "Hello, world!"
print(hello_world.swapcase())
print(hello_world.title())

hELLO, WORLD!
Hello, World!


## A variable is a name that refers to a value

```python
variable_name = literal_value
```

## Choose names for our variables: don'ts

- Do not reuse the names of built-in functions.
- Cannot use [keywords](https://docs.python.org/3/reference/lexical_analysis.html#keywords) as names.
- Cannot start a name with a digit.

Source: <https://www.python.org/dev/peps/pep-0008/>
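
These rules can also be checked in code. The following is a minimal sketch using the standard `keyword` and `builtins` modules; `check_name` is a hypothetical helper written for illustration, not something Python provides.

```python
import builtins
import keyword

def check_name(name):
    """Hypothetical helper: report why a name is a poor choice for a variable."""
    if keyword.iskeyword(name):
        return "keyword"
    if not name.isidentifier():
        return "not a valid identifier"
    if hasattr(builtins, name):
        return "shadows a built-in"
    return "ok"

print(check_name("for"))          # a keyword
print(check_name("2nd_place"))    # starts with a digit
print(check_name("print"))        # name of a built-in function
print(check_name("hello_world"))  # fine
```

The first two cases are rejected by Python itself with a `SyntaxError`, while the third is legal but hides the built-in.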

## If you accidentally replace a built-in function with a variable, use `del` to restore it

```python
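print = 5566              # the name print now refers to an int, hiding the built-in
print("Hello, world!")    # raises TypeError: 'int' object is not callable
#del print                # uncomment to delete the variable and expose the built-in again
#print("Hello, world!")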
```

## Choose names for our variables: dos

- Use a lowercase single letter, word, or words.
- Separate words with underscores to improve readability (so-called snake case).
- Be meaningful.

Source: <https://www.python.org/dev/peps/pep-0008/>
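
As a small illustration of these guidelines, the sketch below contrasts single-letter names with descriptive snake case names. All variable names and values here are hypothetical.

```python
# Hard to read: what do d and t stand for?
d = 42.195
t = 4

# Easier to read: snake case names that describe the values
marathon_distance_km = 42.195
finish_time_hours = 4
average_speed_kmh = marathon_distance_km / finish_time_hours
print(average_speed_kmh)
```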

## Use `#` to write comments in our program

A comment can appear on a line by itself or at the end of a line.

In [3]:
# Turn fahrenheit to celsius
def from_fahrenheit_to_celsius(x):
    out = (x - 32) * 5/9
    return out

print(from_fahrenheit_to_celsius(32))  # turn 32 fahrenheit to celsius
print(from_fahrenheit_to_celsius(212)) # turn 212 fahrenheit to celsius

0.0
100.0


## Everything from `#` to the end of the line is ignored during execution
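
One caveat worth a quick sketch: a `#` inside a string literal is part of the string, not the start of a comment. The variable name `hashtag` is made up for this example.

```python
hashtag = "#python"  # the # inside the quotes belongs to the string
print(hashtag)
print(len(hashtag))  # counts the # as a character
```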

## Data Types

## Values belong to different types; we commonly use

- `int` and `float` for numeric computing.
- `str` for text.
- `bool` for conditionals.
- `NoneType` for undefined values.

## Use the `type` function to check the type of a value or variable

In [4]:
print(type(5566))
print(type(42.195))
print(type("Hello, world!"))
print(type(True))
print(type(False))
print(type(None))

<class 'int'>
<class 'float'>
<class 'str'>
<class 'bool'>
<class 'bool'>
<class 'NoneType'>


## How to form a `str`?

Use paired `'`, `"`, or `"""` to enclose a sequence of characters.

In [5]:
str_with_single_quotes = 'Hello, world!'
str_with_double_quotes = "Hello, world!"
str_with_triple_double_quotes = """Hello, world!"""
print(type(str_with_single_quotes))
print(type(str_with_double_quotes))
print(type(str_with_triple_double_quotes))

<class 'str'>
<class 'str'>
<class 'str'>


## Single or double quotes inside a `str` value can cause a `SyntaxError`

```python
mcd = 'I'm lovin' it!'
```
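
Because a `SyntaxError` stops the whole cell from running, one way to observe it safely is to pass the source text to the built-in `compile()` function and catch the error. This is only a sketch; the names `source` and `outcome` are made up for the example.

```python
# The problematic line, stored as text so it can be inspected without crashing
source = "mcd = 'I'm lovin' it!'"

try:
    compile(source, "<example>", "exec")
    outcome = "compiled"
except SyntaxError:
    outcome = "SyntaxError"

print(outcome)
```

Python reads `'I'` as a complete string, so the rest of the line no longer makes sense to the parser.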

## Use `\` to escape the quotes, or enclose the string in paired `"` or `"""`

In [6]:
mcd = 'I\'m lovin\' it!'
mcd = "I'm lovin' it!"
mcd = """I'm lovin' it!"""
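
The three spellings produce exactly the same string, which a short comparison can confirm. The variable names below are chosen for this sketch only.

```python
escaped = 'I\'m lovin\' it!'
double_quoted = "I'm lovin' it!"
triple_quoted = """I'm lovin' it!"""

# The backslashes are not stored; all three values are identical
print(escaped)
print(escaped == double_quoted == triple_quoted)
```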

## Great features of strings formed with paired `"""`

- They can span multiple lines, which suits a paragraph.
- They are used to write docstrings.

## Use paired `"""` for a paragraph
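
Inside a triple-quoted string, each line break is kept as a newline character unless the line ends with a backslash, which joins it to the next line. A minimal sketch of the difference, with made-up variable names:

```python
with_newlines = """first line
second line"""

joined = """first line \
second line"""

print(with_newlines.count("\n"))  # the line break is stored in the string
print(joined.count("\n"))         # the backslash removed the line break
print(joined)
```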

In [7]:
storyline = """
Chronicles the experiences of a formerly successful banker\
 as a prisoner in the gloomy jailhouse of Shawshank after\
 being found guilty of a crime he did not commit. The film\
 portrays the man's unique way of dealing with his new, torturous\
 life; along the way he befriends a number of fellow prisoners,\
 most notably a wise long-term inmate named Red.
"""

In [8]:
sql_query = """
SELECT *
  FROM world
 WHERE country = 'Taiwan';
"""

## Use paired `"""` for docstring

In [9]:
def from_fahrenheit_to_celsius(x):
    """
    Turns fahrenheit to celsius.
    """
    return (x - 32) * 5/9

help(from_fahrenheit_to_celsius)

Help on function from_fahrenheit_to_celsius in module __main__:

from_fahrenheit_to_celsius(x)
    Turns fahrenheit to celsius.
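
Besides `help()`, the docstring can be read directly. The sketch below repeats the function so that it runs on its own, then uses the `__doc__` attribute and the standard `inspect.getdoc()` function, which strips the surrounding indentation and blank lines.

```python
import inspect

def from_fahrenheit_to_celsius(x):
    """
    Turns fahrenheit to celsius.
    """
    return (x - 32) * 5/9

# Raw docstring, including the line breaks and indentation as written
print(repr(from_fahrenheit_to_celsius.__doc__))
# Cleaned-up docstring
print(inspect.getdoc(from_fahrenheit_to_celsius))
```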



## We've seen arithmetic operators for numeric values

How about those for `str`?

## `str` type takes `+` and `*`

- `+` for concatenation.
- `*` for repetition.

In [10]:
mcd = "I'm lovin' it!"
print(mcd)
print(mcd + mcd)
print(mcd * 3)

I'm lovin' it!
I'm lovin' it!I'm lovin' it!
I'm lovin' it!I'm lovin' it!I'm lovin' it!
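
Note that `+` only concatenates a `str` with another `str`. A minimal sketch of what happens with a number, using hypothetical names `count` and `message`:

```python
count = 3

try:
    message = "Laps: " + count       # str + int is not allowed
except TypeError:
    message = "Laps: " + str(count)  # convert the number first

print(message)
```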


## Format our `str` printouts

- The `.format()` way.
- The `f-string` way.

## The `.format()` way: use `{}` as a placeholder in the string

In [11]:
def hello_anyone(anyone):
    out = "Hello, {}!".format(anyone)
    return out

print(hello_anyone("Anakin Skywalker"))
print(hello_anyone("Luke Skywalker"))

Hello, Anakin Skywalker!
Hello, Luke Skywalker!


## The `f-string` way: put the variable name inside `{}`

In [12]:
def hello_anyone(anyone):
    out = f"Hello, {anyone}!"
    return out

print(hello_anyone("Anakin Skywalker"))
print(hello_anyone("Luke Skywalker"))

Hello, Anakin Skywalker!
Hello, Luke Skywalker!


## Commonly used format

- `{:.nf}` formats a float with `n` decimal places.
- `{:,}` adds commas as thousands separators.

In [13]:
def format_pi(pi):
    return f"{pi:.2f}"

print(format_pi(3.1415))
print(format_pi(3.141592))

3.14
3.14


In [14]:
def format_krw(ntd):
    krw = ntd * 42.67
    return f"{ntd:,} NTD to {krw:,.0f} KRW."

print(format_krw(1000))
print(format_krw(5000))

1,000 NTD to 42,670 KRW.
5,000 NTD to 213,350 KRW.
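
The same format specifications work in the `.format()` way as well. A short sketch, with a made-up `amount`, that combines both specifications and compares the two styles:

```python
amount = 1234567.891

with_format = "{:,.2f}".format(amount)  # the .format() way
with_f_string = f"{amount:,.2f}"        # the f-string way

print(with_format)
print(with_f_string)
print(with_format == with_f_string)
```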


## How to form a `bool`?

- Use keywords `True` and `False` directly.
- Use relational operators.
- Use logical operators.

## Use keywords `True` and `False` directly

In [15]:
print(True)
print(type(True))
print(False)
print(type(False))

True
<class 'bool'>
False
<class 'bool'>


## Use relational operators

We have `==`, `!=`, `>`, `<`, `>=`, `<=` as common relational operators to compare values, and `in`, `not in` to test membership.

In [16]:
print(5566 == 5566.0)
print(5566 != 5566.0)
print('56' in '5566')

True
False
True
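
A few of the operators listed above that the cell does not exercise, as a brief sketch with made-up variable names:

```python
not_substring = '57' not in '5566'   # '57' does not appear inside '5566'
is_longer = 42.195 > 21.0975         # a full marathon versus a half marathon
at_most = 5566 <= 5566               # <= also holds when both sides are equal

print(not_substring)
print(is_longer)
print(at_most)
```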


## Use logical operators

- We have `and`, `or`, `not` as common logical operators to combine `bool` values.
- `and` gives `True` only if both sides are `True`.
- `or` gives `False` only if both sides are `False`.

In [17]:
print(True and True)  # get True only when both sides are True
print(True and False)
print(False and False)
print(True or True)
print(True or False)
print(False or False) # get a False only when both sides are False
# not is straightforward: it flips the value
print(not True)
print(not False)

True
False
False
True
True
False
False
True


## An example of using logical operators

Good marathon weather is often described as dry **and** cold. If the probabilities of dry and cold on race day are both 50% and the two are independent, there is a 25% chance of good marathon weather.

In [18]:
def is_good_marathon_weather(is_dry, is_cold):
    return is_dry and is_cold

print(is_good_marathon_weather(True, True))
print(is_good_marathon_weather(True, False))
print(is_good_marathon_weather(False, True))
print(is_good_marathon_weather(False, False))

True
False
False
False


## An example of using logical operators (cont'd)

Suppose instead that good marathon weather means dry **or** cold. If the probabilities of dry and cold on race day are both 50% and the two are independent, there is a 75% chance of good marathon weather.

In [19]:
def is_good_marathon_weather(is_dry, is_cold):
    return is_dry or is_cold

print(is_good_marathon_weather(True, True))
print(is_good_marathon_weather(True, False))
print(is_good_marathon_weather(False, True))
print(is_good_marathon_weather(False, False))

True
True
True
False
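
The 25% and 75% figures can be checked by listing the four combinations of dry and cold. The sketch below assumes that each combination is equally likely, which holds when both probabilities are 50% and the two conditions are independent; `share_of_good_weather` is a hypothetical helper.

```python
from itertools import product

def share_of_good_weather(rule):
    """Hypothetical helper: fraction of (is_dry, is_cold) combinations where rule is True."""
    outcomes = list(product([True, False], repeat=2))
    good = [rule(is_dry, is_cold) for is_dry, is_cold in outcomes]
    return sum(good) / len(outcomes)  # True counts as 1, False as 0

and_share = share_of_good_weather(lambda is_dry, is_cold: is_dry and is_cold)
or_share = share_of_good_weather(lambda is_dry, is_cold: is_dry or is_cold)

print(and_share)
print(or_share)
```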


## `bool` is quite useful in control flow and filtering data.
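
A brief sketch of both uses, assuming a hypothetical function `describe_weather` and a made-up list of finish times: in control flow a `bool` decides which branch runs, and in filtering it decides which values are kept.

```python
def describe_weather(is_dry, is_cold):
    # Control flow: the bool decides which branch runs
    if is_dry and is_cold:
        return "good marathon weather"
    return "not ideal"

# Filtering: keep only the values for which the condition is True
finish_times_hours = [2.05, 3.5, 4.25, 5.75]  # hypothetical data
sub_four = [t for t in finish_times_hours if t < 4]

print(describe_weather(True, True))
print(describe_weather(True, False))
print(sub_four)
```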

## Python has a special type, `NoneType`, with a single value, `None`

- `None` is used to represent missing or undefined values.
- It is not the same as `False`, an empty string `''`, or `0`.

In [20]:
a_none_type = None
print(type(a_none_type))
print(a_none_type == False)
print(a_none_type == '')
print(a_none_type == 0)
print(a_none_type == None)

<class 'NoneType'>
False
False
False
True
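
Comparing with `== None` works here, but PEP 8 recommends the identity operators `is` and `is not` for comparisons with `None`. A minimal sketch, with made-up names for the results:

```python
a_none_type = None

is_none = a_none_type is None          # identity check, the recommended style
is_not_none = a_none_type is not None

print(is_none)
print(is_not_none)
```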


## A function without a `return` statement actually returns `None`

In [21]:
def hello_anyone(anyone):
    print(f"Hello, {anyone}!")

hello_anyone("Anakin Skywalker")
hello_anyone("Luke Skywalker")

Hello, Anakin Skywalker!
Hello, Luke Skywalker!


In [22]:
func_out = hello_anyone("Anakin Skywalker")
type(func_out)

Hello, Anakin Skywalker!


NoneType

## Besides the `type()` function, data types can also be checked with the `isinstance()` function

In [23]:
an_integer = 5566
a_float = 42.195
a_str = "5566"
a_bool = False
a_none_type = None

print(isinstance(an_integer, int))
print(isinstance(a_float, float))
print(isinstance(a_str, str))
print(isinstance(a_bool, bool))
print(isinstance(a_none_type, type(None))) # same check as: a_none_type is None

True
True
True
True
True
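
The two approaches do not always agree. Since `bool` is a subclass of `int`, `isinstance()` accepts a `bool` where an `int` is expected, while comparing `type()` results does not. `isinstance()` also accepts a tuple of types. A short sketch with made-up variable names:

```python
a_bool = True

by_isinstance = isinstance(a_bool, int)           # bool is a subclass of int
by_type = type(a_bool) == int                     # the exact type is bool, not int
either_number = isinstance(42.195, (int, float))  # matches any type in the tuple

print(by_isinstance)
print(by_type)
print(either_number)
```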


## Data types can be converted explicitly using functions

- `int()` for converting to `int`.
- `float()` for converting to `float`.
- `str()` for converting to `str`.
- `bool()` for converting to `bool`.

## Upcasting (to a more general type) generally works

`NoneType` -> `bool` -> `int` -> `float` -> `str`.

In [24]:
print(bool(None))
print(int(True))
print(float(1))
print(str(1.0))

False
1
1.0
1.0
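
Python also performs some of these conversions implicitly in arithmetic, while skipping a step in the chain is not guaranteed to work. A minimal sketch with made-up variable names:

```python
bool_plus_int = True + 1   # the bool is treated as the int 1
int_plus_float = 1 + 1.0   # the int is converted to a float

print(bool_plus_int, type(bool_plus_int))
print(int_plus_float, type(int_plus_float))

# bool(None) works, but converting None straight to int does not
try:
    int(None)
    none_to_int_failed = False
except TypeError:
    none_to_int_failed = True

print(none_to_int_failed)
```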


## Downcasting (to a more specific type) needs a second look

In [25]:
print(float('1.0'))
print(int('1'))
print(bool('False'))
print(bool('NoneType'))

1.0
1
True
True
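
The last two results are `True` because `bool()` does not read the text of a string: any non-empty `str` converts to `True`, and only the empty string converts to `False`. The sketch below shows this along with two other conversions that deserve care; the variable names are made up for the example.

```python
empty_is_false = bool('')          # only the empty string is False
text_is_true = bool('False')       # non-empty, so True regardless of content

# int() does not accept a str that looks like a float
try:
    int('1.0')
    str_float_to_int_failed = False
except ValueError:
    str_float_to_int_failed = True

two_steps = int(float('1.0'))      # convert to float first, then to int
truncated = int(3.99)              # the fractional part is dropped, not rounded

print(empty_is_false, text_is_true)
print(str_float_to_int_failed)
print(two_steps, truncated)
```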
