# Data types in Python
<hr>

Python data types are the classification or categorization of data items. Different operations can be performed on different data types.

The following chart shows the built-in data types in Python:

```{mermaid}
flowchart TD
    A[Python Data Types] --> B[Numeric]

    subgraph B [Numeric]
        B1[Integer]
        B2[Float]
    end
    
    A --> C[Dictionary]
    A --> D[Boolean]
    A --> E[Set]
    A --> F[Sequence]

    

    subgraph F [Sequence]
        F1[List]
        F2[Tuple]
        F3[String]
    end
    
```

## Numeric: integer and float
<hr>

In [3]:
a = 1
a

1

In [1]:
b = 2.2
b

2.2

Unlike some other programming languages, Python does not require explicit type declarations for variables; the type is determined by the value assigned. In the code above, `a` is written without a decimal point, so Python treats it as an integer (`int`), while `b` contains a decimal point, so Python interprets it as a floating-point number (`float`).

We can use the built-in function `type()` to get the type of a variable.

In [22]:
type(a)

int

In [23]:
type(b)

float

Type casting refers to converting one data type into another. Python provides several built-in functions for casting, including `int()`, `float()` and `str()`.

In [25]:
float(a)  # cast an integer variable to float

1.0

In [26]:
int(b)  # cast a float variable to integer

2

In [5]:
str(a)

'1'
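As a small illustration of casting (the variable names below are made up for this sketch), note that `int()` drops the fractional part instead of rounding, and that strings containing numbers can be cast as well:

```python
# int() truncates toward zero; it does not round
truncated_pos = int(2.9)
truncated_neg = int(-2.9)

# Strings that contain a number can be cast to numeric types
from_text_int = int("42")
from_text_float = float("3.5")

# Numbers can be cast to strings, e.g. before concatenation
label = "value: " + str(10)

print(truncated_pos, truncated_neg, from_text_int, from_text_float, label)
```

If rounding is wanted instead of truncation, the built-in function `round()` is the usual choice.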

- When integers and floating-point numbers are mixed in an operation, the result will be automatically converted to a floating-point number.
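The automatic conversion can be checked with `type()`. The following is a minimal sketch with made-up variable names:

```python
x = 3      # int
y = 0.5    # float

total = x + y       # int + float gives a float
product = x * 2     # int * int stays an int
quotient = 6 / 3    # the division operator / always returns a float

print(total, type(total))
print(product, type(product))
print(quotient, type(quotient))
```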

## String
<hr>

A string is a sequence of characters. We can use **either single quotes `' '` or double quotes `" "`** to express a string.

In [None]:
name = 'chen'
name

In [None]:
city = "London"
city

We can concatenate strings directly using the plus operator `+`:

In [None]:
name + city

Another common way to concatenate strings is the `join()` method: `str.join(sequence)` joins the strings in `sequence`, inserting the string `str` between them, and returns a new string. For example:

In [None]:
",".join(["chen", "zhang", "li"])  # concatenating 3 strings with comma

In [None]:
"-".join(["2020", "05", "13"])  # concatenating 3 strings with dash

In [None]:
" ".join(["Brunel", "University"]) # concatenating 2 strings with space

To concatenate the same string several times, we can use the multiplication operator `*`:

In [None]:
name = "John"
name * 3

We can access individual elements of a **sequence** using indexing. Indices start at 0 when counting from the beginning and at -1 when counting from the end.

- The first element is at position 0, the second at 1, and so on. Negative indices are also supported, where **-1 refers to the last element**, -2 to the second last, and so forth.

In [None]:
name[0]  # 0 is the first index

In [None]:
name[-1]

In [None]:
name[-2]

The syntax for slicing in a **sequence** is `sequence[start:end]`, where start is the starting index (**included**) and end is the stopping index (**excluded**).

In [None]:
name[0:3]  # get the first to the third characters

In [None]:
name[1:]  # get all the characters from index 1 to the end
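The start and end of a slice can also be omitted or negative. The sketch below uses a hypothetical example string `word` to show a few common variations:

```python
word = "Python"  # hypothetical example string

first_three = word[:3]     # start omitted: the slice begins at index 0
last_two = word[-2:]       # negative start: counts from the end
whole_copy = word[:]       # both omitted: the whole string
past_the_end = word[2:100] # an end beyond the length is allowed in slicing

print(first_three, last_two, whole_copy, past_the_end)
```

Unlike slicing, indexing with a position beyond the end (for example `word[100]`) raises an `IndexError`.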

To check whether a character or substring exists in a **sequence**, you can use the `in` or `not in` operators:


In [None]:
"c" not in name

In [None]:
"Bei" in city

- A multi-line string can be written using triple quotes `""" """` or `''' '''`.

In [None]:
str_multi = ''' this is 
a multi line
string'''

print(str_multi)

Other data types can be cast to a string with the function `str()`:

In [None]:
str(123)

In [None]:
str([1, 2, 3])

Some widely used methods for string operations:

|Method|Meaning|
|--|--|
|str.capitalize()|Capitalize the first character in the string 'str'|
|str.upper()|Convert all the characters in the string 'str' to uppercase|
|str.lower()|Convert all the characters in the string 'str' to lowercase|
|len(str)|length of the **sequence** 'str'|
|str.isnumeric()|Check whether all characters in the string 'str' are numeric characters (digits and other numeric characters such as '½')|
|str.isdigit()|Check whether all characters in the string 'str' are digits (such as 0-9)|
|str.isalpha()|Check whether all characters in the string 'str' are alphabetic (letters)|
|str.replace(old, new)|Return a copy of the string 'str' in which every occurrence of the substring 'old' is replaced by the substring 'new'|
|str.strip([chars])|Return a copy of the string 'str' with the leading and trailing characters listed in 'chars' removed (whitespace if 'chars' is omitted)|
|str.split(sep=None)|Return a list of substrings obtained by splitting the string 'str' at the separator 'sep' (whitespace if 'sep' is omitted)|

```{note}
There may be more arguments in those methods. Interested readers may refer to relevant resources for further details.
```

In [None]:
str1 = "python data science"
str1.split()

In [None]:
str1 = "python Data Science"
str1.lower()

In [None]:
str1.capitalize()

In [None]:
str1 = " Brunel University London "
str1.strip(' ')

In [None]:
str1.replace("University", "School")
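Several of these methods are often combined when cleaning text. The following sketch uses a made-up comma-separated record to show `strip()`, `split()`, `isdigit()`, `upper()` and `join()` working together:

```python
record = "  2020-05-13,London,42  "  # hypothetical raw text

cleaned = record.strip()            # remove leading and trailing whitespace
fields = cleaned.split(",")         # split at commas into a list of strings
date_parts = fields[0].split("-")   # split the date at dashes

is_number = fields[2].isdigit()     # True: "42" contains only digits
city_upper = fields[1].upper()
rejoined = "/".join(date_parts)     # join() reverses the effect of split()

print(fields, date_parts, is_number, city_upper, rejoined)
```

Each method returns a new object; the original string `record` is left unchanged.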

## List
<hr>

A Python list is a **dynamically sized** array (it grows and shrinks automatically). We can store elements of **any type** (including another list) in a list. A list is written with **square brackets** `[ ]`, and its elements are separated by **commas**.

In [None]:
list1 = [34, 10, 25] # a list of numerics
list1

In [None]:
list2 = ["chen", "zhang", "wang"] # a list of strings
list2

In [None]:
list3 = [10, "wang", 33] # a list of mixed type of items
list3

- Indexing and slicing of a list work the same way as for a string.

In [None]:
list2[1]  # the second element of list2

In [None]:
list1[0:2]  # the elements indexed by 0 to 1

In [None]:
list1[0:]  # the elements indexed by 0 to the end

- A step size can be given in slicing as the third value.

In [None]:
list3 = [4, 7, 8, 9, 10]
list3[1:4:2]  # the elements indexed by 1 to 3 with step size 2

In [None]:
list2[-2]  # the second last element in list2

- Using two colons followed by -1 indicates reverse order

In [None]:
list1[::-1]  # using two colons followed by -1 indicates reverse order

To change the value of an element in a list, we can directly assign a new value using indexing. (Strings and tuples are immutable sequences, so this does not work for them.)

In [None]:
list = [21, 16, 30]
list[1] = 10  # change the value of the second element in list
list

To delete an element from a list, we can use `del` along with indexing:

In [None]:
list = [34, 46, 23]
del list[1]
list
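Item assignment and `del` modify a list in place, which is possible only because lists are mutable. As a hedged sketch (variable names are illustrative), the same assignment on a string raises a `TypeError`, and a new string has to be built instead:

```python
scores = [21, 16, 30]
scores[1] = 10   # lists are mutable: assignment by index works
del scores[0]    # and so does deleting by index

text = "chen"
error_message = ""
try:
    text[0] = "C"  # strings are immutable: this raises a TypeError
except TypeError as err:
    error_message = str(err)

capitalized = "C" + text[1:]  # build a new string instead

print(scores, capitalized)
print(error_message)
```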

For lists, the operators `+`, `*`, `in`, and `not in` work in the same way as their string counterparts. Below are examples:

In [None]:
list = [13, 23]
list = list + [21, 65]
list

In [None]:
list * 2

In [None]:
12 in list

In [None]:
12 not in list

- A 2D list is created by **nesting** one-dimensional lists within square brackets.

In [None]:
a = [[1, 2, 3], [4, 5]]
a

To access an element in a 2D list, use two sets of square brackets, one index for each level:

In [None]:
a[1][0]

- Add elements to a list with `append()`, `insert()` or `extend()`.

In [None]:
list = [34, 46, 23]
list.append(3)  # append one element at the end of a list
list

In [None]:
list.insert(1, 10)  # insert the element 10 at index 1 of the list
list

In [None]:
list.extend([4, 5, 6])  # add all the elements of another sequence to the end of the list
list
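A frequent source of confusion is the difference between `append()` and `extend()` when the argument is itself a list. A short sketch with made-up names:

```python
nested = [1, 2]
nested.append([3, 4])  # the whole list is added as ONE element

flat = [1, 2]
flat.extend([3, 4])    # each element of the argument is added separately
flat.insert(0, 0)      # insert the value 0 at index 0

print(nested, len(nested))
print(flat, len(flat))
```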

- Use a `for` loop to iterate over all the values in a list.

In [None]:
list = [3, 5, 8]
for i in list:
    print(i)

Some other functions and methods:

|Method|Meaning|
|--|--|
|max(list)|Return the maximum in a list 'list'|
|min(list)|Return the minimum in a list 'list'|
|list(sequence)|Convert another sequence 'sequence' to a list|
|sequence.count(element)|Count the occurrences of an element 'element' in a **sequence** 'sequence'|
|list.reverse()|Reverse the list 'list' in place|
|list.pop(index)|Remove and return the element at the given index 'index' (the last element if no index is given)|
|list.remove(element)|Remove the first element in the list 'list' that is equal to 'element'|
|list.clear()|Clear all the elements in a list 'list'|

In [None]:
a = [1, 2, 3]
a.pop(-1)
print(a)

In [None]:
a.reverse()
a

In [None]:
str1 = 'chench'
str1.count('c')
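The difference between `remove()` and `pop()` is easy to mix up: `remove()` takes a value and deletes only its first occurrence, while `pop()` takes an index and returns the removed element. A minimal sketch with illustrative names:

```python
values = [5, 3, 5, 8]

values.remove(5)           # removes only the FIRST 5
after_remove = values[:]   # copy of the list at this point

last = values.pop()        # no index: removes and returns the last element
first = values.pop(0)      # removes and returns the element at index 0

print(after_remove, last, first, values)
```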


## Dictionary
<hr>

A Python dictionary is a data type that stores data as `{key: value}` pairs. Values can be of any data type and can be duplicated, while **keys can't be repeated and must be immutable**.


In [None]:
dict = {"name": "chen", "score": 95} # create a dict
dict

To access a value in a dict, put the corresponding key in square brackets `[ ]` after the dict, or use the method `get()`:

In [None]:
dict["name"]

In [None]:
dict.get("name")

To add an item to the dict, we can use square brackets `[ ]` with the new key and assign the value:

In [None]:
dict["major"] = "economy"
dict

To change a value in a dict, assign a new value to the corresponding key:

In [None]:
dict["name"] = "wang"
dict["score"] = 80
dict

To delete a key-value pair, use `del`:

In [None]:
del dict["name"]
dict

```{note}
Since keys must be immutable, numbers, strings, and tuples can be used as keys in a dict, but lists cannot.
```
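The rule about keys can be tried out directly. In the sketch below (the dict `grid` is a made-up example), a tuple, a number and a string are accepted as keys, while a list raises a `TypeError`:

```python
grid = {}
grid[(0, 1)] = "tuple key"   # tuples are immutable, so they can be keys
grid[2] = "int key"
grid["name"] = "string key"

error_message = ""
try:
    grid[[0, 1]] = "list key"  # lists are mutable, so this fails
except TypeError as err:
    error_message = str(err)

print(grid)
print(error_message)
```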

Other methods for operating a dict:

|Method|Meaning|
|-|-|
|len(dict)|Return the number of items in a dict 'dict'|
|dict.clear()|Remove all the items in the dict 'dict'|
|dict.get(key, default=None)|Return the value for the given 'key' in the dict 'dict';|
||if the key is not in the dict, return the given default value|
|dict.values()|Return a view of all the values in the dict 'dict'|
|dict.keys()|Return a view of all the keys in the dict 'dict'|
|dict.items()|Return a view of all the (key, value) pairs in the dict 'dict'|
|dict.pop(key)|Remove the item with the given 'key' and return its value|
|dict.popitem()|Remove and return the last inserted (key, value) pair|

In [None]:
dict.popitem()
dict
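The following sketch, using a made-up dict called `student`, shows how `get()` behaves for a missing key and how `pop()` and `popitem()` differ: `pop()` removes the item with a given key, while `popitem()` takes no key and removes the most recently inserted pair.

```python
student = {"name": "chen", "score": 95, "major": "economy"}

missing = student.get("age")          # key not present: returns None
fallback = student.get("age", 0)      # key not present: returns the default 0

removed_score = student.pop("score")  # remove by key, returns the value
last_pair = student.popitem()         # remove the last inserted pair

print(missing, fallback, removed_score, last_pair, student)
```

Accessing a missing key with square brackets, e.g. `student["age"]`, raises a `KeyError`, which is why `get()` is often preferred when a key may be absent.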

- Use a `for` loop together with the methods `keys()`, `values()` or `items()` to iterate over a dict.

In [None]:
d = {1: 'John', 2: 'Male', 'age':22}

# Iterate over keys
for key in d: # or for key in d.keys():
    print(key)

# Iterate over values
for value in d.values():
    print(value)

# Iterate over key-value pairs
for key, value in d.items():
    print(f"{key}: {value}")

## Tuple
<hr>

A tuple in Python is very similar to a list. The difference is that **a tuple is immutable**, meaning its elements can't be changed after creation.

Tuples are created using parentheses `( )` with elements separated by commas. Indexing and slicing follow the same syntax as for lists:

In [None]:
tup = (13, "zh", 20)
tup[1]

In [None]:
tup[0:2]

In [None]:
tup[0:]
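Immutability means that reading elements works as for a list, but assigning to an element does not. A small sketch with illustrative names, which also shows that a one-element tuple needs a trailing comma:

```python
point = (13, "zh", 20)

error_message = ""
try:
    point[0] = 99  # tuples are immutable: this raises a TypeError
except TypeError as err:
    error_message = str(err)

single = (5,)       # the trailing comma makes a one-element tuple
not_a_tuple = (5)   # without the comma this is just the integer 5

x, y, z = point     # a tuple can be unpacked into several variables

print(error_message)
print(type(single), type(not_a_tuple), x, y, z)
```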

Individual elements of a tuple can't be deleted, but `del` can delete the whole tuple:

In [None]:
tup = (13, "zh", 20)
del tup  # fully delete the tuple 'tup'

The operators `+`, `*`, `in` and `not in` can still be used with tuples:

In [None]:
tup = (23, 45, 21)
tup = tup + (32, 21)
tup

In [None]:
tup * 2

In [None]:
25 in tup

In [None]:
25 not in tup

## Boolean
<hr>

In Python, Boolean values (type `bool`) are typically produced through logical comparisons, and there are only two possible values: `True` or `False`.

In [None]:
10 > 3

In [None]:
3 == 4

When making logical comparisons, two equal signs `==` test for equality, while a single equal sign `=` assigns a value.

In [None]:
a = 3  # one equal sign assigns the value 3 to a
a == 4  # two equal signs check whether the value of a equals 4

Other values can be converted to Boolean values with the `bool()` function.

- Any number with a value of zero (`0` or `0.0`) is considered `False`, while any non-zero number (positive or negative) is considered `True`.
- An empty list, dict, tuple or string, i.e., `[]`, `{}`, `()`, `''`, is considered `False`.

In [None]:
bool('')

In [None]:
bool(0.0)
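These rules depend on whether a value is zero or empty, not on what it looks like. The sketch below (names are illustrative) shows two cases that often surprise beginners, and how the rules are typically used in an `if` statement:

```python
nonempty_string = bool("0")  # True: the string contains one character
list_with_zero = bool([0])   # True: the list contains one element
negative = bool(-2.5)        # True: any non-zero number
empty_dict = bool({})        # False: no items

basket = []
if not basket:               # an empty list counts as False
    status = "empty"
else:
    status = "has items"

print(nonempty_string, list_with_zero, negative, empty_dict, status)
```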

Multiple conditions can be combined with the logical operators, which in Python are the keywords `and`, `or` and `not`.

In [None]:
10 > 3 and 3 > 2

In [None]:
10 > 3 and 3 > 4

In [None]:
10 > 3 or 3 > 4

In [None]:
not 3 > 4
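When several operators appear in one expression, `not` is evaluated first, then `and`, then `or`. Python also allows comparisons to be chained. A brief sketch with a made-up variable `age`:

```python
age = 25

chained = 18 <= age < 65               # chained comparison
spelled_out = (18 <= age) and (age < 65)

# 'and' binds tighter than 'or', so this is True or (False and False)
mixed = True or False and False
grouped = (True or False) and False    # parentheses change the result

print(chained, spelled_out, mixed, grouped)
```

Adding parentheses is a simple way to make the intended order explicit.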

## Set*[^1]
<hr>

A set is an unordered collection of unique elements, typically created using curly braces `{ }`.

[^1]: \* means this section may not be delivered in the class.

```{note}
To create an empty set, you must use `set()` instead of `{ }`, because `{ }` defaults to creating an empty dictionary.
```
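The point made in the note can be verified with `type()`. The sketch below (with illustrative names) also shows that duplicate elements are dropped when a set is created:

```python
empty_braces = {}      # this is an empty dict, not a set
empty_set = set()      # this is an empty set

deduplicated = {3, 1, 3, 2, 1}  # duplicates are stored only once

print(type(empty_braces), type(empty_set))
print(len(deduplicated))
```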

In [None]:
set1 = {34, 23, "chen"}
set1

In [None]:
set2 = {12, "34", 10}
set2

The set operators are:

|Operator|Meaning|
|-|-|
|`-`|Difference: the elements of the left set that are not in the right set|
|`\|`|Union: the elements that are in either set|
|`&`|Intersection: the elements that are in both sets|
|`^`|Symmetric difference: the elements that are in exactly one of the two sets|

In [None]:
set1 = {34, 12, "chen"}
set2 = {12, "wang", 10}
set1 - set2

In [None]:
set1 | set2

In [None]:
set1 & set2

In [None]:
set1 ^ set2
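With purely numeric sets the results are easier to follow. The sketch below uses two made-up sets and also checks one relationship between the operators: the result of `^` equals the union minus the intersection.

```python
evens = {2, 4, 6, 8}
small = {1, 2, 3, 4}

only_evens = evens - small   # in evens but not in small
either = evens | small       # in at least one of the sets
both = evens & small         # in both sets
exactly_one = evens ^ small  # in exactly one of the sets

# sorted() is used only to print the elements in a stable order
print(sorted(only_evens), sorted(either), sorted(both), sorted(exactly_one))
print(exactly_one == either - both)
```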

To manipulate elements in a set:
- Use the ``add()`` method to insert a **single element**
- Use the ``update()`` method to add **multiple elements** (taken from an iterable such as a list, tuple, or dictionary; for a dictionary, its keys are added)
- Use the ``remove()`` method to delete a specific element
- Check if an element exists in a set using ``in`` or ``not in`` operators:

In [None]:
set = {13, 45, 67}
set.add(10)
set

In [None]:
set.remove(10)
set

In [None]:
set.update([80, 44])
set

In [None]:
44 in set

In [None]:
44 not in set
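One detail worth knowing is that `remove()` raises a `KeyError` when the element is not in the set. The standard set method `discard()`, which is not covered above, removes an element only if it is present. A hedged sketch with a made-up set `tags`:

```python
tags = {13, 45, 67}

tags.discard(99)  # 99 is not in the set: nothing happens, no error

remove_failed = False
try:
    tags.remove(99)  # 99 is not in the set: raises KeyError
except KeyError:
    remove_failed = True

tags.update((1, 2), [2, 3])  # update() accepts one or more iterables

print(remove_failed, sorted(tags))
```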

## `random` library
<hr>

Random numbers are widely used in simulations. Python's built-in `random` module generates pseudorandom numbers: they are produced by a deterministic algorithm, so they only appear random and are fully determined by the seed.

Commonly used functions in `random`:

| Method| Description |
|:--:|:--|
| `seed(a=None)` | Initialize the random number generator; if no seed is given, the current system time or a randomness source provided by the operating system is used |
| `random()` | Generate a random float in the range [0.0, 1.0) (0.0 included, 1.0 excluded) |
| `randint(a, b)` | Generate a random integer in the range [a, b] (both ends included) |
| `uniform(a, b)` | Generate a random float between a and b (because of floating-point rounding, the end point b may or may not be included) |
| `shuffle(seq)` | Shuffle the elements of sequence `seq` in place (returns None) |
| `sample(seq, k)` | Return a list of `k` elements chosen from sequence `seq` without replacement (each position is picked at most once) |


The random number seed can be specified using the `seed()` function. As long as the seed remains the same, the sequence of generated random numbers will also be identical.

In [None]:
import random

random.random()  # without a seed, a different number is generated each time the code is run

In [None]:
random.seed(100)
random.random()  # with a fixed seed, the same number is generated each time the code is run
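The effect of the seed can be checked within a single run by resetting it and drawing again. The sketch below (variable names and the seed value are arbitrary choices) also tries out a few of the other functions from the table:

```python
import random

random.seed(100)
first_run = [random.random(), random.random(), random.random()]

random.seed(100)  # resetting the same seed restarts the same sequence
second_run = [random.random(), random.random(), random.random()]

roll = random.randint(1, 6)       # integer from 1 to 6, both included

deck = [1, 2, 3, 4, 5]
random.shuffle(deck)              # reorders the list in place
picked = random.sample(deck, 2)   # two elements from different positions

print(first_run == second_run)
print(roll, deck, picked)
```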

## Exercises
<hr>

```{exercise}
:label: auto-type
What type of number is 5.2 in Python?

A.&nbsp;&nbsp;  string

B.&nbsp;&nbsp;  integer

C.&nbsp;&nbsp;  complex

D.&nbsp;&nbsp;  float
```

````{solution} auto-type
:class: dropdown
D
````

```{exercise}
:label: bool
What does bool(0) return?

A.&nbsp;&nbsp;  True

B.&nbsp;&nbsp;  False
```

````{solution} bool
:class: dropdown
B
````

```{exercise-start}
:label: int-value
```
What will be the result of the following code:
```python
print(int(35.88))
```

A.&nbsp;&nbsp;  36

B.&nbsp;&nbsp;  35.8

C.&nbsp;&nbsp;  35

D.&nbsp;&nbsp;  35.88

```{exercise-end}
```

````{solution} int-value
:class: dropdown
C
````

```{exercise-start}
:label: list-type
```
What data type is the object below? 

```python
arr = [1, "brunel", 0]
```

A.&nbsp;&nbsp;  bool

B.&nbsp;&nbsp;  str

C.&nbsp;&nbsp;  list

D.&nbsp;&nbsp;  dict

```{exercise-end}
```


````{solution} list-type
:class: dropdown
C
````

```{exercise-start}
:label: list-slicing
```
What is the output of the following code? 

```python
x = 'Welcome'
print(x[3:5])
```

A.&nbsp;&nbsp;  co

B.&nbsp;&nbsp;  com

C.&nbsp;&nbsp;  lcom

D.&nbsp;&nbsp;  lc

```{exercise-end}
```


````{solution} list-slicing
:class: dropdown
A
````

```{exercise-start}
:label: tuple-plus
```
What is the output of the following code? 

```python
tup = (1, 2, 3) 
print(2 * tup) 
```

A.&nbsp;&nbsp;  (1, 2, 3, 1, 2, 3)

B.&nbsp;&nbsp;  (1, 2, 3, 4, 5, 6)

C.&nbsp;&nbsp;  (2, 4, 6)

D.&nbsp;&nbsp;  (1, 2, 3)

```{exercise-end}
```


````{solution} tuple-plus
:class: dropdown
A
````

```{exercise}
:label: cast

How to convert the string '10' to a float?

A.&nbsp;&nbsp;  int('10')

B.&nbsp;&nbsp;  float('10')

C.&nbsp;&nbsp;  float '10'

D.&nbsp;&nbsp;  str('10')

```

````{solution} cast
:class: dropdown
B
````

```{exercise}
:label: string-upper

What is a correct syntax to print a string 'txt' in upper case letters?

A.&nbsp;&nbsp;  'txt'.toupper()

B.&nbsp;&nbsp;  'txt'.to_upper()

C.&nbsp;&nbsp;  'txt'.upper()

D.&nbsp;&nbsp;  'txt'.capital()

```

````{solution} string-upper
:class: dropdown
C
````

```{exercise}
:label: string-strip

How to return the string txt = ' United Kingdom ' without any whitespace at the beginning or the end?

A.&nbsp;&nbsp;  txt.space()

B.&nbsp;&nbsp;  txt.strip()

C.&nbsp;&nbsp;  txt.pop()

D.&nbsp;&nbsp;  txt.split()

```

````{solution} string-strip
:class: dropdown
B
````

```{exercise-start}
:label: str-plus
```

What is the output of the following code?
```python
a = 'Hello'
b = 'World'
print(a + b)
```

A.&nbsp;&nbsp;  HelloWorld

B.&nbsp;&nbsp;  Hello World

C.&nbsp;&nbsp;  a + b

D.&nbsp;&nbsp;  'Hello''World'

```{exercise-end}
```

````{solution} str-plus
:class: dropdown
A
````

```{exercise-start}
:label: list-str-output
```
What is the output of the following code?

```python
arr = ["uk","brunel", 100]
print(arr[1][2])
```

A.&nbsp;&nbsp;  100

B.&nbsp;&nbsp;  brunel

C.&nbsp;&nbsp;  k

D.&nbsp;&nbsp;  u

```{exercise-end}
```


````{solution} list-str-output
:class: dropdown
D
````

```{exercise-start}
:label: dict-len
```

What is the output of the following code?

```python
d = {1: 'us', 2: 'uk', 3: 'china', 4: 20}
print(len(d))
```

A.&nbsp;&nbsp;  8

B.&nbsp;&nbsp;  20

C.&nbsp;&nbsp;  6

D.&nbsp;&nbsp;  4

```{exercise-end}
```


````{solution} dict-len
:class: dropdown
D
````

```{exercise}
:label: dict-true

A dictionary cannot have two keys with the same name.

A.&nbsp;&nbsp;  True

B.&nbsp;&nbsp;  False

```

````{solution} dict-true
:class: dropdown
A
````

```{exercise}
:label: bool-true

Which of the following is considered a "True" value in Python?

A.&nbsp;&nbsp;  2

B.&nbsp;&nbsp;  ''

C.&nbsp;&nbsp;  {}

D.&nbsp;&nbsp;  []

```

````{solution} bool-true
:class: dropdown
A
````

```{exercise-start}
:label: dict-access
```
How to access the value of year from the following dict?

```python
car = {
  "brand": "Tesla",
  "year": 2013
}
```

A.&nbsp;&nbsp;  car['year']

B.&nbsp;&nbsp;  car.get('year')

C.&nbsp;&nbsp;  car[2013]

D.&nbsp;&nbsp;   A or B

```{exercise-end}
```

````{solution} dict-access
:class: dropdown
D
````

```{exercise-start}
:label: dict-pop
```
How to remove the key-value pair "year": 2013 from the following dict?

```python
car = {
  "brand": "Tesla",
  "year": 2013
}
```

A.&nbsp;&nbsp;  car.clear()

B.&nbsp;&nbsp;  car.get('year')

C.&nbsp;&nbsp;  car.pop('year')

D.&nbsp;&nbsp;  car.remove('year')

```{exercise-end}
```

````{solution} dict-pop
:class: dropdown
C
````

```{exercise-start}
:label: dict-change
```
How to change the 'type' from 'apple' to 'banana' for the following dict?

```python
dict = {'type' : 'apple', 'name' : 'red2'}
```

A.&nbsp;&nbsp;  dict['type'] = 'banana'

B.&nbsp;&nbsp;  dict('type') = 'banana'

C.&nbsp;&nbsp;  dict{'type'} = 'banana'

D.&nbsp;&nbsp;   dict['name'] = 'banana'

```{exercise-end}
```

````{solution} dict-change
:class: dropdown
A
````

```{exercise}
:label: multi-variable-assign
What is a correct syntax to assign the value 'Hello World' to 3 variables in one statement?

A.&nbsp;&nbsp;  x, y, z = 'Hello World'

B.&nbsp;&nbsp;  x | y | z = 'Hello World'

C.&nbsp;&nbsp;  x & y & z = 'Hello World'

D.&nbsp;&nbsp;  x = y = z = 'Hello World'
```

````{solution} multi-variable-assign
:class: dropdown
D
````

```{exercise-start}
:label: dict-generate
```
What is the output of the following code?

```python
a = {i: i * i for i in range(4)}
print(a)
```

A.&nbsp;&nbsp;  {1, 2, 3, 4}

B.&nbsp;&nbsp;  {0, 1, 2, 3}

C.&nbsp;&nbsp;  {0: 0, 1: 1, 2: 2, 3: 3}

D.&nbsp;&nbsp;   {0: 0, 1: 1, 2: 4, 3: 9}

```{exercise-end}
```

````{solution} dict-generate
:class: dropdown
D
````

```{exercise-start}
:label: dict-loop
```
How to loop through all the values of the following dict?

```python
car = {
  "brand": "Tesla",
  "year": 2013
}
```

```{exercise-end}
```

````{solution} dict-loop
:class: dropdown

```python
for y in car.values():
    print(y)
```
````

```{exercise-start}
:label: dict-loop2
```
How to loop through all the keys and values of the following dict?

```python
car = {
  "brand": "Tesla",
  "year": 2013,
  "price": 30000
}
```

```{exercise-end}
```

````{solution} dict-loop2
:class: dropdown

```python
for y, z in car.items():
    print(y, z)
```
````

```{exercise}
:label: random

Randomly sample 3 characters from the string 'abcedefg'. (Tip: use a function from the library 'random'.)
```

````{solution} random
:class: dropdown
```{code-block} python
import random

random.sample('abcedefg', 3)
```
````
