<a href="https://colab.research.google.com/github/veyselberk88/Data-Science-Tools-and-Ecosystem/blob/main/lec04.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<img src="./ccsf.png" alt="CCSF Logo" width=200px style="margin:0px -5px">

# Lecture 04: Data Types

Associated Textbook Sections: [4.0, 4.1, 4.2, 4.3](https://ccsf-math-108.github.io/textbook/chapters/04/Data_Types.html)

---

## Overview

* [Data Types](#Data-Types)
* [Numbers](#Numbers)
* [Strings](#Strings)
* [Boolean Values](#Boolean-Values)
* [Type Casting](#Type-Casting)
* [No Value](#No-Value)
* [Attributes and Methods](#Attributes-and-Methods)

---

## Set Up the Notebook

In [None]:
from datascience import *
import numpy as np

---

## Data Types

---

### `type`

* Every value in Python is a certain data type
* The type of a value determines what operations can be performed on it
* The built-in function `type` shows the data type of a value: `type(2)`
* An expression's "type" is based on its value, not how it looks
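As a minimal sketch of the last point, the cell below checks three expressions whose types may not match how they look. The variable names are illustrative only. Note that `print(type(x))` displays `<class 'float'>`, while a bare `type(x)` at the end of a notebook cell displays just `float`.

```python
# The type comes from the value an expression evaluates to,
# not from how the expression is written.
quotient = 4 / 2        # looks like a whole number, but / always gives a float
digits_as_text = "2"    # looks like a number, but the quotes make it a string
whole_sum = 1 + 1       # int + int stays an int

print(type(quotient))        # <class 'float'>
print(type(digits_as_text))  # <class 'str'>
print(type(whole_sum))       # <class 'int'>
```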

---

### Demo: Types

Use the `type` function to check an object's type.

In [None]:
type(10)

int

In [None]:
a = 10
type(a)

int

In [None]:
type(4.5)

float

In [None]:
type("string")

str

In [None]:
type(abs)

builtin_function_or_method

In [None]:
type(True)

bool

In [None]:
type(make_array)

function

In [None]:
type(Table())

datascience.tables.Table

---

## Numbers

---

### `ints` and `floats`

* Python has two real number types
    * `int`: an integer of any size
    * `float`: a number with an optional fractional part
* An `int` never has a decimal point; a `float` always does
* A `float` might be printed using scientific notation
* Three limitations of `float` values:
    * They have limited size (but the limit is huge)
    * They have limited precision of about 15-16 significant digits
    * After arithmetic, the final few decimal places can be wrong
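The three limitations can be seen directly. This is a small sketch using the standard-library `sys.float_info`; the variable names are made up for illustration.

```python
import sys

# Limited size: the largest finite float is roughly 1.8e308.
largest = sys.float_info.max
overflowed = largest * 10      # too large to represent, so the result is inf

# Limited precision: a change smaller than the last stored digit is lost.
big = 1e16
same_as_big = big + 1          # the + 1 disappears

# Arithmetic error: the final digits of a result can be wrong.
total = 0.1 + 0.2

print(largest)
print(overflowed)              # inf
print(same_as_big == big)      # True
print(total)                   # 0.30000000000000004
```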


---

### Demo: Numbers

Demonstrate integers and floats.

In [None]:
# int
10

10

In [None]:
# float
10.0

10.0

---

Try out some arithmetic with integers.

In [None]:
# Addition
9+2

11

In [None]:
# Exponentiation
9**3

729

---

The number of digits an `int` can hold is limited only by the computer's memory.

In [None]:
123456789 ** 123

180443894485522835714954192172999280028845717532682776236340940102536698145222569108484463188835141501257568786614496374890490631237110580592001682129147574651845715171456148359301092015447205623057495772659564027213301182232076238590331900681806078027178740976490955033321310868454820309128358774485790967770683032944024352558539124788679067434454263656340662912379366253271751953736731074819372491000795297394853010105837025484346139393089929535058486260828830048634323191537485125711757553017086494262086507174576160642886415424336562785001644174485982226421972489721110767356064962137106505728778418700556641023076586539800506459911177905554903899443902042199874341653922049139720885160820424705059531702449496414152206583904252440335125073512355264351679192059781951740756716496372272101373104569806788535169770019927578333904122000732663242308371786294445444694565563343590247938552086658203292972070407426713686306344322058332865613102498986620473134625473086906778038872631750464721441869

---

Dividing integers with `/` produces a float value, even when there is no remainder.

In [None]:
# Division with positive remainder
20/3

6.666666666666667

In [None]:
# Division with no remainder
20/5

4.0
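When an integer result is wanted instead, Python also provides floor division (`//`) and remainder (`%`). These operators are not part of the demo above; the sketch below is only meant to contrast them with `/`.

```python
true_quotient = 20 / 3     # true division: always a float
whole_part = 20 // 3       # floor division: an int when both operands are ints
remainder = 20 % 3         # remainder after floor division

print(true_quotient)       # 6.666666666666667
print(whole_part)          # 6
print(remainder)           # 2
```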

---

Arithmetic that includes a float value produces a float value.

In [None]:
10**0.5

3.1622776601683795

In [None]:
16**0.5

4.0

In [None]:
2+2.0

4.0

In [None]:
2*3.0

6.0

In [None]:
# For readability, you can use _
326_000_000*15_000

4890000000000

---

Float values can produce surprising results.

In [None]:
# Almost canceling
(10**0.5)**2

10.000000000000002

In [None]:
# floating-point rounding
0.12345678901234567890123456789

0.12345678901234568

In [None]:
0.12345678901234567890123456789 - 0.1234567890123456789

0.0
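Because of these rounding effects, comparing floats with `==` is fragile. One common approach, sketched here with the standard-library `math.isclose` and its default tolerance, is to test whether two values are close rather than identical.

```python
import math

almost_ten = (10 ** 0.5) ** 2                 # 10.000000000000002

exactly_equal = almost_ten == 10              # False: off in the last digit
close_enough = math.isclose(almost_ten, 10)   # True: within a small relative tolerance
rounded = round(almost_ten, 6)                # rounding hides the stray digit

print(exactly_equal)
print(close_enough)
print(rounded)                                # 10.0
```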

---

Very small or very large float values are displayed in scientific notation.

In [None]:
# For readability, you can use _
6 / 4_000

0.0015

In [None]:
6 / 400_000_000

1.5e-08

In [None]:
1.5e-08

1.5e-08

In [None]:
400_000_000*1.5e-08

5.999999999999999

In [None]:
200_000_000-2e+08

0.0
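Scientific notation only affects how a float is displayed, not its value. As an illustrative sketch, format specifiers in an f-string can switch between the two displays; the variable names are hypothetical.

```python
tiny = 6 / 400_000_000

as_scientific = f"{tiny:e}"       # scientific notation with 6 decimal places
as_fixed = f"{tiny:.10f}"         # fixed-point notation with 10 decimal places
same_value = 1.5e-08 == 0.000000015

print(as_scientific)              # 1.500000e-08
print(as_fixed)                   # 0.0000000150
print(same_value)                 # True
```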

---

Be careful with your syntax. Sometimes common notation from mathematics or other languages does not work here.

In [None]:
x = 5

In [None]:
2 * x

10

In [None]:
# A SyntaxError
2x

SyntaxError: invalid decimal literal (530253802.py, line 2)

---

## Strings

---

### Text and Strings

* A string (`str`) value is a snippet of text of any length
    * `'a'`
    * `'word'`
    * `"there can be 2 sentences. Here's the second!"`
* Strings consisting of numbers can be converted to numbers
    * `int('12')`
    * `float('1.2')`
* Any value can be converted to a string
    * `str(5)`


---

### Demo: Strings

Explore how quotation marks are used to make strings.

In [None]:
"Don't worry"

In [None]:
'hello'

In [None]:
# A SyntaxError: the apostrophe ends the single-quoted string early
'Don't always use single quotes'

In [None]:
""

---

Notice that there is some kind of "arithmetic" with strings.

In [None]:
# concatenation
"straw"+"berry"

In [None]:
"ha"*10

In [None]:
# A TypeError
"ha"**8

In [None]:
# A TypeError
'lo' * 5.5
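The cell below is a small sketch that collects the string operations that work and catches the one that does not, so the whole cell runs without stopping. The `try`/`except` wrapper and variable names are additions for illustration.

```python
word = "straw" + "berry"       # concatenation joins two strings
laugh = "ha" * 3               # repetition needs a string and an int

try:
    "lo" * 5.5                 # a float repeat count is not allowed
except TypeError as error:
    message = str(error)       # keep the error text instead of crashing

# Joining a string and a number with + also fails; convert the number first.
label = "Lecture " + str(4)

print(word)                    # strawberry
print(laugh)                   # hahaha
print(message)
print(label)                   # Lecture 4
```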

---

## Boolean Values

---

### `bool`

* Boolean values are used to represent truth or falsehood
* They can only have two possible values:
    * `True`
    * `False`
* This data type can be used to control the flow of code and aid in data analysis.
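As a brief sketch of both uses, the cell below builds a boolean from a comparison, uses it in an `if` statement, and counts how many values in a list pass a test. The temperatures and threshold are made-up example data.

```python
temperature = 72                       # hypothetical measurement

is_warm = temperature > 65             # a comparison evaluates to a bool
print(type(is_warm))                   # <class 'bool'>

# Control flow: choose which branch runs.
if is_warm:
    advice = "No jacket needed"
else:
    advice = "Bring a jacket"

# Data analysis: True counts as 1 and False as 0 when summed.
readings = [58, 72, 69, 61]            # hypothetical data
warm_days = sum(reading > 65 for reading in readings)

print(advice)                          # No jacket needed
print(warm_days)                       # 2
```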

---

## Type Casting

---

### Type Casting

* Type casting is the process of converting a value from one data type to another
* Strings that contain numbers can be converted to floats and sometimes integers
    * `int('12')`
    * `float('1.2')`
    * `float('one point two')` --- **Not a Good Idea!** (raises a `ValueError`)
* Any value can be converted to a string
    * `str(5)`
* Numbers can be converted to other numeric types
    * `float(1)`
    * `int(1.2)` --- **DANGER: Loses Information!**
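The following sketch illustrates the two warnings above: `int` discards the fractional part instead of rounding, and a string that does not look like a number cannot be converted. The fallback value `None` is an arbitrary choice for this example.

```python
truncated = int(1.7)               # 1: the fractional part is dropped
truncated_negative = int(-1.7)     # -1: truncation is toward zero
rounded = round(1.7)               # 2: use round when rounding is intended

# int("3.0") raises a ValueError, but converting through float works.
from_text = int(float("3.0"))      # 3

try:
    parsed = float("one point two")
except ValueError:
    parsed = None                  # conversion failed

print(truncated, truncated_negative, rounded, from_text, parsed)
```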
  

---

### Demo: Type Casting

Try converting between strings and other data types.

In [None]:
int("3")

3

In [None]:
type(int("3"))

int

In [None]:
# A ValueError
int('3.0')

ValueError: invalid literal for int() with base 10: '3.0'

In [None]:
type(float(3.0))

float

In [None]:
int(4.2)

4

In [None]:
2+int("2")

4

In [None]:
float(4)

4.0

---

## No Value

---

### `NoneType`

* Some things in Python have the `NoneType` data type.
* `None` is the only value of that type; it represents the absence of a value.

### Demo: `NoneType`

In [None]:
type(None)

NoneType

In [None]:
print("Hello")

Hello


In [None]:
type(print("Hello"))

Hello


NoneType
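The same thing happens with any function that has no `return` statement. In this minimal sketch, `greet` is a hypothetical function written for illustration; the usual way to check for `None` is with `is`.

```python
def greet(name):
    """Print a greeting. There is no return statement."""
    print("Hello, " + name)

result = greet("CCSF")       # prints the greeting; the call itself returns None

print(result is None)        # True
print(type(result))          # <class 'NoneType'>
```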

---

## Attributes and Methods

---

### Attributes and Methods

* Python is an object-oriented programming language
* Almost everything in Python is an object
* Objects usually have attributes and methods associated with them
    * An attribute is some characteristic of the object
    * A method is some function associated with the object
* Using a `.` after an object will allow you to access the object's attributes and methods
* In Jupyter, press the tab key after the `.` to see a list of available attributes and methods
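As a rough sketch of the difference, attributes are read without parentheses, while methods are called with them. The built-in `dir` function lists the same names that tab completion shows. The values used here are arbitrary examples.

```python
number = 2.5

# Attributes: characteristics of the object, read without parentheses.
real_part = number.real              # 2.5
imaginary_part = number.imag         # 0.0

# Methods: functions attached to the object, called with parentheses.
is_whole = number.is_integer()       # False
shout = "dog".upper()                # 'DOG'

# dir lists the attribute and method names available after the dot.
string_names = dir("dog")
has_replace = "replace" in string_names

print(real_part, imaginary_part, is_whole, shout, has_replace)
```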
  

---

### Demo: Attributes and Methods

Replace a character in a string with another character.

In [None]:
a_string = "dog"
a_string

'dog'

In [None]:
a_string=a_string.replace("d","g")
a_string

'gog'

In [None]:
# a_string was updated because the result was assigned back to it
a_string

'gog'
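String methods such as `replace` return a new string and leave the original unchanged; a name only changes when the result is assigned to it. A small sketch with hypothetical variable names:

```python
original = "dog"
changed = original.replace("d", "g")   # returns a new string

print(original)                        # dog  (unchanged)
print(changed)                         # gog

# Assigning the result back to the same name is what updates it.
original = original.replace("d", "g")
print(original)                        # gog
```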

In [None]:
?str.islower

Signature: str.islower(self, /)
Docstring:
Return True if the string is a lowercase string, False otherwise.

A string is lowercase if all cased characters in the string are lowercase and
there is at least one cased character in the string.
Type:      method_descriptor

In [None]:
a_string.islower()

True

In [None]:
a_string = a_string.capitalize()
a_string

'Gog'

In [None]:
a_string.islower()

False

---

## Attribution

This content is licensed under the <a href="https://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)</a> and derived from the <a href="https://www.data8.org/">Data 8: The Foundations of Data Science</a> offered by the University of California, Berkeley.

<img src="./by-nc-sa.png" width=100px>