
# Data types

In many programming languages, one must specify the *type* of each piece of information stored in the program: is it an **integer** (that is, a whole number with no decimal places or fractions), is it a number with **decimal places**, is it a single **character**, is it a **string** of characters, and so forth. In Python, it is not mandatory to declare the type, although every value still has a type associated with it.

In [2]:
a = 13 # an integer
print(a, type(a))

13 <class 'int'>


In [3]:
b = 12.34 # numbers with decimal places are called floating-point numbers
print(b, type(b))

12.34 <class 'float'>


In [5]:
c = 'c'
print(c, type(c))
d = 'hello'
print(d, type(d))
print(type("hello"))

c <class 'str'>
hello <class 'str'>
<class 'str'>


In Python, single characters are treated as strings of length one, whereas some other programming tools distinguish between single characters and concatenated characters forming a string. In a sense, a string is an *array* of characters.
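To make the array analogy a bit more concrete, here is a minimal sketch. The variable names `word`, `first`, and `letters` are illustrative only and do not appear in the cells above.

```python
word = 'hello'  # illustrative variable, not one of the notebook's own

# Indexing picks out one character; the result is still a str of length one
first = word[0]
print(first, type(first))
print(len(word), len(first))

# Iterating over a string visits its characters one at a time
letters = [ch for ch in word]
print(letters)
```

Unlike arrays in many other languages, a Python string cannot be modified in place, so the analogy only goes so far.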

In [7]:
print(a * b)
print(type(a * b))

160.42
<class 'float'>


In [8]:
print(a * c)
print(type(a * c))

ccccccccccccc
<class 'str'>


In [9]:
b * c # this will NOT work, you cannot make a non-integer number of copies of a string

TypeError: can't multiply sequence by non-int of type 'float'

In [10]:
a * d

'hellohellohellohellohellohellohellohellohellohellohellohellohello'

In [11]:
c + d # adding two strings concatenates them

'chello'

In [12]:
c * d # this again will not work since it has no intuitively clear meaning to it

TypeError: can't multiply sequence by non-int of type 'str'

Programming tools that have explicit data types also tend to have specific **ranges** of values that each data type can represent. These ranges may depend on the specific computer on which the code is executed. Python (version 3) does not have a limit on the size of an integer, although the memory available on the computer will eventually limit the values that can be stored.

In [18]:
11 * 222 * 3333 * 4444 * 55555 * 66666 * 777777 * 8888888 * 99999999

92615803638917501706845604094581223294080
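As a rough comparison, the sketch below steps just past the range of a 64-bit unsigned integer, a fixed-size type found in many other languages, and then looks at the largest finite `float`, which does have a limit in Python. The name `big` is illustrative, and the exact float limit printed may depend on the platform.

```python
import sys

big = 2 ** 64            # one more than the largest unsigned 64-bit value
print(big, type(big))    # still an ordinary int
print(big.bit_length())  # number of bits needed to write it in binary

# Floats, by contrast, do have a largest finite value
print(sys.float_info.max)
```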

Another important concern is the precision at which values are stored and processed. The arithmetic operators and the types of the values involved will determine the result in a way that is sometimes mathematically unintuitive and unexpected.

In [19]:
6 / 2

3.0

In [22]:
60 // 20

3

In [21]:
0.6 / 0.2

2.9999999999999996

Why is the last one not three exactly?

Floating-point numbers (`float`) are **not** like real numbers in math. One reason is that not all real numbers are rational (fractions), so storing decimal values as $x = a / b$, where $a$ and $b$ are `int`, would fall short. What engineers decided back in the day was to opt for
$$x = a \times b ^ c$$
where $a$ is called the **mantissa** (or *significand* or *coefficient*), $b$ is the **base** (or *radix*), and $c$ is the **exponent** (or *scale* or *characteristic*). This allows for relatively efficient storage and processing, at the cost of limited precision, as in the case of `0.6 / 0.2` above: not all real numbers can be represented in this format when $b$ is fixed, and memory constraints limit how big or small $a$ and $c$ can be. Computers typically use $b = 2$, and neither 0.6 nor 0.2 can be written exactly with a finite number of binary digits, so both are stored as nearby approximations and their quotient comes out slightly below three.
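The decomposition above can be inspected directly from Python. The following is a minimal sketch, assuming the usual situation in which `float` uses base $b = 2$; the names `x`, `m`, and `e` are illustrative. It also shows one common workaround, comparing with a tolerance via `math.isclose` instead of `==`.

```python
import math
from decimal import Decimal

x = 12.34  # same value as b in the earlier cell

# math.frexp splits x into a mantissa m and an exponent e with x == m * 2**e
m, e = math.frexp(x)
print(m, e)
print(m * 2 ** e == x)

# Decimal(0.6) reveals the exact value actually stored for the literal 0.6
print(Decimal(0.6))

# Exact comparison fails, comparison with a tolerance succeeds
print(0.6 / 0.2 == 3)
print(math.isclose(0.6 / 0.2, 3))
```

Note that `math.frexp` normalizes the mantissa to lie between 0.5 and 1, which is only one of several conventions for writing $a \times b^c$.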

For more information, there is a [dedicated website](https://floating-point-gui.de/) to help you understand.
