<div style="width: 38.5%;">
    <p><strong>City College of San Francisco</strong></p>
    <hr>
    <p>MATH 108 - Foundations of Data Science</p>
</div>

# Lecture 04: Data Types

Associated Textbook Sections: [4.0, 4.1, 4.2, 4.3](https://inferentialthinking.com/chapters/04/Data_Types.html)

---

## Overview

* [Data Types](#Data-Types)
* [Numbers](#Numbers)
* [Strings](#Strings)
* [Boolean Values](#Boolean-Values)
* [Type Casting](#Type-Casting)
* [No Value](#No-Value)
* [Attributes and Methods](#Attributes-and-Methods)

---

## Set Up the Notebook

In [1]:
from datascience import *
import numpy as np

## Data Types

### `type`

* Every value in Python is a certain data type
* The type of a value determines what operations can be performed on it
* The built-in function `type` can be used to show you the data type of a value: `type(2)`
* An expression's "type" is based on its value, not how it looks
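
As a small illustration of the last point, the sketch below checks a few expressions whose appearance might suggest a different type. The variable names are made up for this example.

```python
# The type depends on the value an expression evaluates to, not on how it is written.
looks_like_int = 10 / 2      # evaluates to 5.0: the / operator yields a float
looks_like_number = '10'     # quotation marks make this a string, not a number
whole_float = 3.0            # the decimal point makes this a float

print(type(looks_like_int))     # <class 'float'>
print(type(looks_like_number))  # <class 'str'>
print(type(whole_float))        # <class 'float'>
```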

### Demo: Types

Use the `type` function to check an object's type.

In [2]:
type(10) # SOLUTION

int

In [3]:
a = 10 # SOLUTION
type(a) # SOLUTION

int

In [4]:
type(4.5) # SOLUTION

float

In [5]:
type('abc') # SOLUTION

str

In [6]:
type(abs) # SOLUTION

builtin_function_or_method

In [7]:
type(True) # SOLUTION

bool

In [8]:
type(make_array()) # SOLUTION

numpy.ndarray

In [9]:
type(Table()) # SOLUTION

datascience.tables.Table

---

## Numbers

---

### `ints` and `floats`

* Python has two real number types 
    * `int`: an integer of any size
    * `float`: a number with an optional fractional part
* An `int` never has a decimal point; a `float` always does
* A `float` might be printed using scientific notation
* Three limitations of `float` values:
    * They have limited size (but the limit is huge)
    * They have limited precision of about 15-16 significant digits
    * After arithmetic, the final few decimal places can be wrong
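
The three limitations can be seen in a short sketch. It uses `sys.float_info` from the standard library, which is not part of the lecture, and the exact limits assume the usual 64-bit floats found on typical systems.

```python
import sys

# Limited size: the largest representable float (roughly 1.8e308 on typical systems)
largest = sys.float_info.max
overflowed = largest * 10     # exceeds the limit and becomes infinity

# Limited precision: the number of decimal digits a float can reliably hold
digits = sys.float_info.dig

# After arithmetic, the final few decimal places can be wrong
total = 0.1 + 0.2

print(largest)
print(overflowed)   # inf
print(digits)       # 15
print(total)        # 0.30000000000000004
```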


---

### Demo: Numbers

Demonstrate integers and floats.

In [10]:
# int
13 # SOLUTION

13

In [11]:
# float
13.0 # SOLUTION

13.0

---

Try out some arithmetic with integers.

In [12]:
# Multiplication
4 * 5 # SOLUTION

20

In [13]:
# Exponentiation
1234 ** 5 # SOLUTION

2861381721051424

---

The number of digits in an `int` is limited only by the computer's memory.

In [14]:
123456789 ** 123

180443894485522835714954192172999280028845717532682776236340940102536698145222569108484463188835141501257568786614496374890490631237110580592001682129147574651845715171456148359301092015447205623057495772659564027213301182232076238590331900681806078027178740976490955033321310868454820309128358774485790967770683032944024352558539124788679067434454263656340662912379366253271751953736731074819372491000795297394853010105837025484346139393089929535058486260828830048634323191537485125711757553017086494262086507174576160642886415424336562785001644174485982226421972489721110767356064962137106505728778418700556641023076586539800506459911177905554903899443902042199874341653922049139720885160820424705059531702449496414152206583904252440335125073512355264351679192059781951740756716496372272101373104569806788535169770019927578333904122000732663242308371786294445444694565563343590247938552086658203292972070407426713686306344322058332865613102498986620473134625473086906778038872631750464721441869

---

Dividing integers with `/` always produces a float value, even when there is no remainder.

In [15]:
# Division with positive remainder
20 / 3 # SOLUTION

6.666666666666667

In [16]:
# Division with no remainder
20 / 2 # SOLUTION

10.0

---

Arithmetic that includes a float value produces a float value.

In [17]:
10 ** 0.5 # SOLUTION

3.1622776601683795

In [18]:
16 ** 0.5 # SOLUTION

4.0

In [19]:
2 + 2.0 # SOLUTION

4.0

In [20]:
2 * 3.0 # SOLUTION

6.0

In [21]:
# For readability, you can use _
4 / 7_000 # SOLUTION

0.0005714285714285715

---

Float values can produce surprising results.

In [22]:
# Almost canceling
(10 ** 0.5) ** 2 # SOLUTION

10.000000000000002

In [23]:
# floating-point rounding
0.12345678901234567890123456789

0.12345678901234568

In [24]:
0.12345678901234567890123456789 - 0.1234567890123456789

0.0
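
One common way to cope with these small errors is to compare floats within a tolerance instead of testing for exact equality. The sketch below uses `math.isclose` from the standard library; it is offered as a suggestion and is not part of the lecture demo.

```python
import math

almost_ten = (10 ** 0.5) ** 2   # 10.000000000000002, as in the demo above

exactly_equal = almost_ten == 10             # False: the values differ in the last digit
close_enough = math.isclose(almost_ten, 10)  # True: equal within a small relative tolerance

print(exactly_equal, close_enough)   # False True
```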

---

Very small or very large float values are displayed in scientific notation.

In [25]:
# For readability, you can use _
6 / 4_000

0.0015

In [26]:
6 / 400_000_000

1.5e-08

In [27]:
1.5e-08

1.5e-08

In [28]:
400_000_000 * 1.5e-08 # SOLUTION

5.999999999999999

In [29]:
2e+10 - 20_000_000_000 # SOLUTION

0.0

---

Be careful with your syntax. Sometimes common notation from mathematics or other languages does not work here.

In [30]:
x = 5

In [31]:
2 * x

10

In [32]:
# A SyntaxError
# 2x

---

## Strings

---

### Text and Strings

* A string (`str`) value is a snippet of text of any length
    * `'a'`
    * `'word'`
    * `"there can be 2 sentences. Here's the second!"`
* Strings consisting of numbers can be converted to numbers
    * `int('12')`
    * `float('1.2')`
* Any value can be converted to a string
    * `str(5)`


---

### Demo: Strings

Explore how quotation marks are used to make strings.

In [33]:
'Flavor' # SOLUTION

'Flavor'

In [34]:
"Flavor" # SOLUTION

'Flavor'

In [35]:
# A SyntaxError:
# 'Don't always use single quotes'

In [36]:
"Don't always use single quotes" # SOLUTION

"Don't always use single quotes"

---

Notice that there is some kind of "arithmetic" with strings.

In [37]:
# concatenation
'straw' + 'berry' # SOLUTION

'strawberry'

In [38]:
'straw' + ' ' + 'berry' # SOLUTION

'straw berry'

In [39]:
'ha' * 10 # SOLUTION

'hahahahahahahahahaha'

In [40]:
# A TypeError
# 'lo' * 5.5
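
A related pitfall is joining a string and a number with `+`. The sketch below, with made-up variable names, shows the failing expression as a comment and one possible fix using `str`.

```python
count = 3

# A TypeError: + cannot join a str and an int
# 'berries: ' + count

# Converting the number to a string first makes concatenation work
fixed_label = 'berries: ' + str(count)

# Repetition works because count is an int
repeated = 'ha' * count

print(fixed_label)   # berries: 3
print(repeated)      # hahaha
```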

---

## Boolean Values

---

### `bool`

* Boolean values are used to represent truth or falsehood
* They can only have two possible values:
    * `True`
    * `False`
* This data type can be used to control the flow of code and aid in data analysis.
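
As a rough sketch of both uses, the example below builds Boolean values from comparisons, uses one to choose between two branches, and counts how many readings pass a test. The temperatures and names are hypothetical.

```python
temperature = 72   # hypothetical measurement

is_warm = temperature > 70        # comparisons evaluate to a bool
is_freezing = temperature <= 32

print(type(is_warm))   # <class 'bool'>

# Controlling the flow of code
if is_warm:
    description = 'warm'
else:
    description = 'not warm'

# Aiding data analysis: True counts as 1 and False as 0 when summed
readings = [68, 72, 75, 59]
number_warm = sum(reading > 70 for reading in readings)

print(description, number_warm)   # warm 2
```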

---

## Type Casting

---

### Type Casting

* Type casting is the process of converting a value from one data type to another
* Strings that contain numbers can be converted to floats and sometimes integers
    * `int('12')`
    * `float('1.2')`
    * `float('one point two')` --- **Raises a `ValueError`!**
* Any value can be converted to a string
    * `str(5)`
* Numbers can be converted to other numeric types
    * `float(1)`
    * `int(1.2)` --- **DANGER: Loses Information!**
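
The sketch below illustrates the two warnings in this list. It assumes nothing beyond built-in functions; the `try`/`except` block is used only so that the failing conversion does not stop the code.

```python
# int() drops the fractional part; round() goes to the nearest integer instead
truncated = int(3.9)             # 3
truncated_negative = int(-3.9)   # -3
rounded = round(3.9)             # 4

# int('3.0') raises a ValueError, but converting in two steps works
two_step = int(float('3.0'))     # 3

# Text that is not written as a number cannot be converted
try:
    float('one point two')
    conversion_failed = False
except ValueError:
    conversion_failed = True

print(truncated, truncated_negative, rounded, two_step, conversion_failed)
```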
  

---

### Demo: Type Casting

Try converting between strings and other data types.

In [41]:
int('3') # SOLUTION

3

In [42]:
int(3.9) # SOLUTION

3

In [43]:
# A ValueError
# int('3.0')

In [44]:
float('3.0') # SOLUTION

3.0

In [45]:
str(4.5) # SOLUTION

'4.5'

In [46]:
2 + int("2") # SOLUTION

4

---

## No Value

---

### `NoneType`

* Some things in Python have a `NoneType` data type.
* `None` is the only value of this type; it represents the absence of a value.
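
`None` often shows up as the result of calling a function that does not return anything. The sketch below defines a hypothetical function `greet` to show this.

```python
def greet(name):
    # Hypothetical function that prints but has no return statement
    print('Hello, ' + name)

result = greet('world')   # prints the greeting; nothing is returned

print(result)             # None
print(result is None)     # True
```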

### Demo: `NoneType`

In [47]:
type(None) # SOLUTION

NoneType

In [48]:
print("Hello")

Hello


In [49]:
type(print("Hello")) # SOLUTION

Hello


NoneType

---

## Attributes and Methods

---

### Attributes and Methods

* Python is an object-oriented programming language
* Almost everything in Python is an object
* Objects usually have attributes and methods associated with them
    * An attribute is some characteristic of the object
    * A method is some function associated with the object
* Using a `.` after an object will allow you to access the object's attributes and methods
* In Jupyter, press the tab key after the `.` to see a list of available attributes and methods
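
To make the difference concrete, here is a minimal sketch using a `date` object from the standard library's `datetime` module. The module is not used elsewhere in the lecture and the date is an arbitrary example.

```python
from datetime import date

a_date = date(2024, 2, 29)   # arbitrary example date

# Attributes: characteristics of the object, accessed without parentheses
year = a_date.year           # 2024
month = a_date.month         # 2

# Methods: functions associated with the object, called with parentheses
text = a_date.isoformat()    # '2024-02-29'
shout = 'dog'.upper()        # 'DOG'

# dir() lists the names of an object's attributes and methods
has_upper = 'upper' in dir('dog')   # True

print(year, month, text, shout, has_upper)
```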
  

---

### Demo: Attributes and Methods

Replace a character in a string with another character.

In [50]:
a_string = "dog" # SOLUTION
a_string

'dog'

In [51]:
a_string.replace('d', 'f') # SOLUTION

'fog'

In [52]:
# a_string was not updated
a_string

'dog'

In [53]:
str.islower?

Signature: str.islower(self, /)
Docstring:
Return True if the string is a lowercase string, False otherwise.

A string is lowercase if all cased characters in the string are lowercase and
there is at least one cased character in the string.
Type:      method_descriptor

In [54]:
a_string.islower() # SOLUTION

True

In [55]:
a_string = a_string.capitalize() # SOLUTION
a_string

'Dog'

In [56]:
a_string.islower() # SOLUTION

False

---

<footer>
    <p>Adapted from UC Berkeley DATA 8 course materials.</p>
    <p>This content is offered under a <a href="https://creativecommons.org/licenses/by-nc-sa/4.0/">CC Attribution Non-Commercial Share Alike</a> license.</p>
</footer>