<div align="center"><img src="../images/LKYCIC_Header.jpg"></div>

**Table of contents**<a id='toc0_'></a>    
- [1-01: Data Types and Structures](#toc1_)    
  - [Data Types](#toc1_1_)    
    - [Integer](#toc1_1_1_)    
    - [Float](#toc1_1_2_)    
    - [String](#toc1_1_3_)    
  - [Variable](#toc1_2_)    
    - [Check type](#toc1_2_1_)    
    - [Operations and functions](#toc1_2_2_)    
      - [Numeric Operations](#toc1_2_2_1_)    
      - [String Operations](#toc1_2_2_2_)    
  - [Data Structures](#toc1_3_)    
    - [List](#toc1_3_1_)    
      - [For loop](#toc1_3_1_1_)    
      - [Conditional Statement](#toc1_3_1_2_)    
    - [Indexing system](#toc1_3_2_)    
      - [Add element to the list](#toc1_3_2_1_)    
    - [Dictionary](#toc1_3_3_)    
      - [Create Dictionary from Lists](#toc1_3_3_1_)    
  - [Next step](#toc1_4_)    

<!-- vscode-jupyter-toc-config
	numbering=false
	anchor=true
	flat=false
	minLevel=1
	maxLevel=6
	/vscode-jupyter-toc-config -->
<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->

# <a id='toc1_'></a>[1-01: Data Types and Structures](#toc0_)

We will begin by introducing some of the most commonly used data types in Python, such as **integers, floats, and strings**, which form the foundation for handling different kinds of information in programming.

Understanding these data types is essential for performing calculations, managing text, and working with various types of data.  

In addition to these fundamental data types, we will explore two versatile and widely used data structures: **lists and dictionaries**. 

1. List

    Lists allow you to store and manage collections of items in an ordered sequence, making it easier to perform operations like iteration and indexing. 

2. Dictionary
    
    Dictionaries enable you to store data in a key-value format, offering a flexible way to organise and retrieve information efficiently.

## <a id='toc1_1_'></a>[Data Types](#toc0_)

### <a id='toc1_1_1_'></a>[Integer](#toc0_)

Integers are whole numbers, positive or negative, with no fractional part. For example:

- Number of students in a class.  
- Population of a city.      
- Employee IDs.    
- House numbers in addresses.     
- Points in a sports game.      
- Distance in metres (when rounded).      
- Views or likes on a social media post (when displayed as whole numbers).  

In [None]:
age = 28

In a notebook cell, the value of the last line (if it is an expression) is displayed automatically.

For any line above the last one, you need to call `print()` explicitly:

In [None]:
age
print(age)

In [None]:
print(age)
age

In [None]:
print(age)
print(age)

### <a id='toc1_1_2_'></a>[Float](#toc0_)

Floats, or floating-point numbers, represent real numbers that include a fractional component. 

- Stock prices (e.g., 154.32 per share)  
- Interest rates (e.g., 3.75%)   
- Latitude and longitude coordinates (e.g., 51.5074, -0.1278)  
- Elevation or depth (e.g., 8848.86 m for Mount Everest)  
- Wind speed (e.g., 10.5 m/s)   

In [None]:
bank_balance = 100.50

In [None]:
print(bank_balance)

### <a id='toc1_1_3_'></a>[String](#toc0_)

Strings store textual information. They are versatile and can represent virtually any kind of **textual or symbolic data**. For example:
 
- Addresses: "221B Baker Street, London"     
- Sentences or paragraphs from books, articles, or documents    
- Social media posts or tweets  
- Comments: "Great product, highly recommend!"  
- Search queries: "best Python tutorials"  
- Survey responses  
- Error messages: "File not found"     
- Country names: "United Kingdom"  
- City names: "London" 

In [None]:
my_info = "I am 28 years old and I have SG$100.50 in my bank account."

In [None]:
print(my_info)

## <a id='toc1_2_'></a>[Variable](#toc0_)

In Python, a variable is **a named storage location** used to hold data that can be accessed and manipulated during the execution of a program. 

Think of a variable as a "container" that stores a value, which can be changed or reused later.

### <a id='toc1_2_1_'></a>[Check type](#toc0_)

We can use the `type()` **function** to identify the type of a variable.

In [None]:
type(age)

In [None]:
type(bank_balance)

In [None]:
type(my_info)
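
As a small sketch of what these checks return, the snippet below repeats the three variables from earlier and compares each result of `type()` with the built-in type names `int`, `float`, and `str`. The helper names ending in `_type` are only for illustration.

```python
age = 28
bank_balance = 100.50
my_info = "I am 28 years old and I have SG$100.50 in my bank account."

age_type = type(age)
balance_type = type(bank_balance)
info_type = type(my_info)

print(age_type)      # <class 'int'>
print(balance_type)  # <class 'float'>
print(info_type)     # <class 'str'>

# type() returns the type object itself, so it can be compared directly
print(age_type == int)  # True
```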

### <a id='toc1_2_2_'></a>[Operations and functions](#toc0_)

#### <a id='toc1_2_2_1_'></a>[Numeric Operations](#toc0_)

`I am one year older, and I made 50 SGD in the last year`. Then:

In [None]:
print(age, bank_balance)

In [None]:
age + 1

In [None]:
bank_balance  + 50

In [None]:
print(age, bank_balance)

We can see that the values of these variables remain unchanged, no matter which operations we perform on them.

To actually change the value of a variable, we need to use `=` to assign the new value to it:

In [None]:
age = age + 1
bank_balance = bank_balance + 50

In [None]:
print(age, bank_balance)

Converting between float and integer:

In [None]:
round(bank_balance)

In [None]:
type(round(bank_balance))

In [None]:
round(1.566668, 3)

In [None]:
type(round(1.566668, 3))

In [None]:
# int to float
float(age)

*Note:* The hash symbol `#` starts a comment in Python. Everything after it on the same line is ignored when the code runs.

In [None]:
# float to int
int(bank_balance)
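
The two conversions above do not behave the same way, so a short comparison may help. This sketch uses an illustrative variable `value` (not part of the notebook) to show that `round()` goes to the nearest integer while `int()` simply drops the fractional part.

```python
value = 2.7

rounded = round(value)   # nearest integer
truncated = int(value)   # fractional part is dropped

print(rounded, truncated)  # 3 2

# Exact halves are rounded to the nearest even integer
print(round(2.5), round(3.5))  # 2 4

# int() truncates towards zero, also for negative numbers
print(int(-2.7), round(-2.7))  # -2 -3
```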

#### <a id='toc1_2_2_2_'></a>[String Operations](#toc0_)

Split a string variable:

`Task:` Extract the road name and the postal code from the full address

In [None]:
address = "Somapah Rd, Singapore 487372"

In [None]:
address.split(",")

In [None]:
rd_name, post_code = address.split(",")

print("The road name is " + rd_name + " and the postal code is " + post_code)

In [None]:
rd_name, post_code = address.split(", Singapore ")

print("The road name is " + rd_name + " and the postal code is " + post_code)
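
Splitting on `","` alone leaves a leading space and the word "Singapore" in the second part. As one possible alternative to the second split above, the sketch below removes the surrounding whitespace with `strip()` and then splits the remainder on whitespace. The names `rest` and `city` are illustrative.

```python
address = "Somapah Rd, Singapore 487372"

rd_name, rest = address.split(",")
print(repr(rest))  # ' Singapore 487372' (repr shows the leading space)

rest = rest.strip()              # remove whitespace at both ends
city, post_code = rest.split()   # no argument: split on whitespace

print("The road name is " + rd_name + " and the postal code is " + post_code)
```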

**F-strings**

F-strings are a way to format strings in Python. They let you embed expressions inside string literals using curly braces `{}`.

In [None]:
print(my_info)

In [None]:
print(f"I am {age} years old and I have SG${bank_balance} in my bank account.")

In [None]:
print(f"I am {age} years old and I have SG${bank_balance:.3f} in my bank account.")

Another way is to use the string `format()` method:

In [None]:
print("I am {age} years old and I have SG${bank_balance} in my bank account.".format(age=age, bank_balance=bank_balance))

In [None]:
print("I am {age} years old and I have SG${bank_balance:.3f} in my bank account.".format(age=age, bank_balance=bank_balance))

Formatted strings (or f-strings) in Python offer a powerful and dynamic way of incorporating variables and expressions into strings.

Personally, I use f-strings more often because they are more straightforward.
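
The part after the colon inside the braces (such as `.3f` above) is a format specification. The sketch below shows a few common ones, along with an expression inside the braces. The values and variable names are chosen only for illustration.

```python
age = 29
bank_balance = 150.5
population = 5870750

two_decimals = f"SG${bank_balance:.2f}"              # fixed to two decimal places
with_commas = f"{population:,}"                      # thousands separators
with_expression = f"Next year I will be {age + 1}."  # any expression is allowed

print(two_decimals)     # SG$150.50
print(with_commas)      # 5,870,750
print(with_expression)  # Next year I will be 30.
```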

## <a id='toc1_3_'></a>[Data Structures](#toc0_)

If you are familiar with **vectors, data frames, and lists in R**, learning their equivalents in Python will be straightforward.  

For those who are completely new to programming, these data structures are also quite easy to understand.  

### <a id='toc1_3_1_'></a>[List](#toc0_)

We specify a list with square brackets: `[]` and **commas separating each entry** in the list:

| R                                                            | Python                                                       |
| ------------------------------------------------------------ | ------------------------------------------------------------ |
| Vectors                                                      | Lists                                                        |
| A vector is a one-dimensional, homogeneous data structure (all elements must be of the same type). | A list in Python is one-dimensional but heterogeneous (can contain different data types). |
| numeric_vector <- c(1, 2, 3)<br/>character_vector <- c("a", "b", "c") | numeric_list = [1, 2, 3]<br/>mixed_list = [1, "a", True]     |

Python lists are more flexible with types.

In [None]:
asean_country_list = ['Brunei Darussalam', 'Cambodia', 'Indonesia', 'Lao PDR', 'Malaysia', 'Myanmar', 'Philippines', 'Singapore', 'Thailand', 'Vietnam']

type(asean_country_list)

`len()` gives us the number of items in a list. How many countries are in the list of ASEAN countries?

In [None]:
len(asean_country_list)

#### <a id='toc1_3_1_1_'></a>[For loop](#toc0_)

```python
for variable in sequence:
    # Code to execute
```

Loop over the list and print each country name:

In [None]:
for country in asean_country_list:
    print(country)

Find the countries whose names start with 'M':

In [None]:
for country in asean_country_list:
    print(country)
    print(country.startswith('M')) # startswith() returns True if the name starts with 'M', otherwise False
    print('---')

#### <a id='toc1_3_1_2_'></a>[Conditional Statement](#toc0_)

A condition evaluates to one of two values, `True` or `False`, which indicates whether the condition holds.

The `if-else` statement is the most common conditional structure used.

You can add an `if condition` to your loop to decide whether the country name starts with 'M'. Print different messages depending on the result of the condition:  

In [None]:
'Singapore'.startswith('M')

In Python, `True` and `False` behave like the integers 1 and 0 in arithmetic (the `bool` type is a subclass of `int`):

1. `True` equals 1.

2. `False` equals 0.

In [None]:
'Singapore'.startswith('S') + 2

In [None]:
'Singapore'.startswith('M') + 2
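
Because booleans behave like 1 and 0, they can also be added up to count how many times a condition holds. The following is a minimal sketch of that idea. The counter name `m_count` is illustrative.

```python
starts_with_s = 'Singapore'.startswith('S')

print(isinstance(starts_with_s, int))  # True: bool is a subclass of int
print(True == 1, False == 0)           # True True
print(starts_with_s + 2)               # 3

asean_country_list = ['Brunei Darussalam', 'Cambodia', 'Indonesia', 'Lao PDR',
                      'Malaysia', 'Myanmar', 'Philippines', 'Singapore',
                      'Thailand', 'Vietnam']

# Each True adds 1 to the counter, each False adds 0
m_count = 0
for country in asean_country_list:
    m_count = m_count + country.startswith('M')

print(m_count)  # 2 (Malaysia and Myanmar)
```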

An `if` statement runs its indented block only when the condition is truthy. For example:

In [None]:
if 1:
    print('This is True')  # Executed because `1` is truthy.
    
if True:
    print('This is also True')  # Executed because `True` is truthy.
    
if 5:
    print('This can be True')  # Executed because any non-zero number is truthy.


In [None]:
if 0:
    print('This is False')  # Not executed because `0` is falsy.

if False:
    print('This is also False')  # Not executed because `False` is falsy.

if '':
    print('This is False')  # Not executed because an empty string is falsy.

if []:
    print('This is False')  # Not executed because an empty list is falsy.

To summarise:

The body of an `if` statement is executed when the value is truthy, such as:

1. Non-zero numbers (e.g., 1, 5, -3).

2. The boolean value True.

The body of an `if` statement is not executed when the value is falsy, such as:

1. 0 (zero).

2. The boolean value False.

3. Other falsy values like None, empty strings (""), empty lists ([]), etc.
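
One way to check how Python treats a particular value is to pass it to the built-in `bool()` function, which returns the truth value an `if` statement would use. The variable names in this sketch are illustrative.

```python
nonzero_number = bool(-3)      # True
zero = bool(0)                 # False
non_empty_text = bool('text')  # True
empty_text = bool('')          # False
non_empty_list = bool([1, 2])  # True
empty_list = bool([])          # False
nothing = bool(None)           # False

print(nonzero_number, zero)
print(non_empty_text, empty_text)
print(non_empty_list, empty_list)
print(nothing)
```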

In [None]:
for country in asean_country_list:
    print(country)
    if country.startswith('M') == True:
        print(f"{country} starts with M")
    else:
        print(f"{country} does not start with M")
    print('---')

In Python, `True` equals 1 and `False` equals 0.

So if you replace `True` with the number 1 in the comparison, you get the same output:

In [None]:
for country in asean_country_list:
    print(country)
    if country.startswith('M') == 1:
        print(f"{country} starts with M")
    else:
        print(f"{country} does not start with M")
    print('---')
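
Since `startswith()` already returns `True` or `False`, the comparison with `True` or `1` can be left out entirely. The sketch below uses that shorter form and collects the matching names in a new list. The list name `m_countries` is illustrative, and `append()` is the list method introduced later in this notebook.

```python
asean_country_list = ['Brunei Darussalam', 'Cambodia', 'Indonesia', 'Lao PDR',
                      'Malaysia', 'Myanmar', 'Philippines', 'Singapore',
                      'Thailand', 'Vietnam']

m_countries = []
for country in asean_country_list:
    if country.startswith('M'):  # the result is used directly as the condition
        m_countries.append(country)

print(m_countries)  # ['Malaysia', 'Myanmar']
```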

## Function

Functions are a very important feature in almost every programming language because they allow us to reuse previously defined code.

```python
def function_name(input):

    '''
    Your code goes in between
    '''

    return output
```
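
As a concrete instance of this template, here is a minimal sketch of a function with two inputs and one output. The function name `years_between` is hypothetical and chosen only for this illustration.

```python
def years_between(current_year, future_year):
    '''Return the number of years from current_year to future_year.'''
    return future_year - current_year

# Call the function and store the returned value
gap = years_between(2025, 2070)
print(gap)  # 45
```

Once defined, the function can be called again with different years without rewriting the calculation.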

In [None]:
age = 28
bank_balance = 100.50

In [None]:
print(age, bank_balance)

In [None]:
current_year = 2025
future_year = 2070

age = age + (future_year - current_year)
bank_balance = bank_balance + (future_year - current_year) * 50

print(age, bank_balance)

`Task`: Can you write a function that calculates my age and bank balance in a future year if my bank_balance increases by 50 dollars every year?

In [None]:
age = 28
bank_balance = 100.50
current_year = 2025
future_year = 2070
annual_savings = 50

def calculate_future_age_and_bank_balance(age, bank_balance, current_year, future_year, annual_savings):
    age = age + (future_year - current_year)
    bank_balance = bank_balance + (future_year - current_year) * annual_savings
    return age, bank_balance

age, bank_balance = calculate_future_age_and_bank_balance(age, bank_balance, current_year, future_year, annual_savings)

print(f"By the year {future_year}, I will be {age} years old and have SG$ {bank_balance} in my bank account.")

`Challenge`: Can you write a function `calculate_future_balance` that calculates my age and bank balance in a future year if my bank_balance increases by 10% every year?

In [None]:
age = 28
bank_balance = 100.50
current_year = 2025
future_year = 2080
growth_rate = 0.1

#——————————————————————————————————————————————————————————————————————————————————————————————#


#——————————————————————————————————————————————————————————————————————————————————————————————#

age, bank_balance = calculate_future_balance(age, bank_balance, growth_rate, current_year, future_year)

print(f"By the year {future_year}, I will be {age} years old and have SG$ {bank_balance} in my bank account.")

### <a id='toc1_3_2_'></a>[Indexing system](#toc0_)

Python is **zero-indexed**, which means that the indexing of elements in sequences (such as lists, tuples, and strings) starts from 0. 

In [None]:
print(asean_country_list)

In [None]:
print(asean_country_list[1])

The first element of a sequence has an index of 0, the second element has an index of 1, and so on.

In [None]:
print(asean_country_list[0])

Python also supports **negative indexing**, which allows you to access elements from the end of the sequence. The last element is at index -1, the second-last at -2, and so on.

In [None]:
print(asean_country_list[-1])

We can also get multiple items from a list by specifying a start index and a stop index separated by a colon: `[start:stop]`.

The start index is included, but the stop index is not.

In [None]:
print(asean_country_list[1:3]) # The second element and the third element

In [None]:
print(asean_country_list[1], asean_country_list[2])

If one side of the colon is left empty, the slice runs from the beginning of the list (empty start) or to the end of the list (empty stop).

In [None]:
print(asean_country_list[1:])

In [None]:
print(asean_country_list[:-3])
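
To summarise the indexing and slicing rules in one place, the sketch below applies them to the same list and to a string, since strings follow the same rules. The variable names holding the results are illustrative.

```python
asean_country_list = ['Brunei Darussalam', 'Cambodia', 'Indonesia', 'Lao PDR',
                      'Malaysia', 'Myanmar', 'Philippines', 'Singapore',
                      'Thailand', 'Vietnam']

second_and_third = asean_country_list[1:3]  # index 1 and 2; index 3 is excluded
first_two = asean_country_list[:2]          # empty start: from the beginning
last_three = asean_country_list[-3:]        # empty stop: up to the end

print(second_and_third)  # ['Cambodia', 'Indonesia']
print(first_two)         # ['Brunei Darussalam', 'Cambodia']
print(last_three)        # ['Singapore', 'Thailand', 'Vietnam']

# Strings are indexed and sliced the same way
word = 'Singapore'
print(word[0], word[-1], word[:4])  # S e Sing
```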

#### <a id='toc1_3_2_1_'></a>[Add element to the list](#toc0_)

Append one country 'Japan' to the list:

In [None]:
asean_country_list.append('Japan')

The new element is added to the end of the list:

In [None]:
print(asean_country_list)

Remove 'Japan' from the list:

In [None]:
asean_country_list.remove('Japan')

In [None]:
print(asean_country_list)
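
Two details of `remove()` are worth keeping in mind: it deletes only the first matching item, and it raises a `ValueError` if the item is not in the list. The sketch below illustrates both with a small made-up list named `countries`, using the `in` operator to check membership before removing.

```python
countries = ['Singapore', 'Japan', 'Malaysia', 'Japan']

# Only the first 'Japan' is removed
countries.remove('Japan')
print(countries)  # ['Singapore', 'Malaysia', 'Japan']

# Check membership first to avoid a ValueError
if 'Korea' in countries:
    countries.remove('Korea')
else:
    print('Korea is not in the list')

# append() always adds to the end
countries.append('Thailand')
print(countries)  # ['Singapore', 'Malaysia', 'Japan', 'Thailand']
```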

### <a id='toc1_3_3_'></a>[Dictionary](#toc0_)

Dictionaries are organized in pairs of keys and values. 

The **keys** can be used to access the **values**.

| R                                                            | Python                                                       |
| ------------------------------------------------------------ | ------------------------------------------------------------ |
| Lists                                                        | Dictionaries                                                 |
| A list is a versatile structure that can contain elements of different types, including other lists. | A dictionary stores key-value pairs, offering functionality somewhat akin to named lists in R. |
| lst <- list(name = "Alice", age = 25, scores = c(90, 85, 88)) | person = {"name": "Alice", "age": 25, "scores": [90, 85, 88]} |

Dictionaries are specified in Python using `{}`. **Colons separate the keys and values**. 

Let's take a look at an example dictionary:

In [None]:
# An example dictionary
sg_info = {
    'country': 'Singapore',
    'year': 2025,
    'population': 5870750}

In [None]:
sg_info['year']

In [None]:
print(f"The country is {sg_info['country']} and the population is {sg_info['population']} at the year {sg_info['year']}.") # This is an example of string interpolation
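
Beyond reading a value by its key, dictionaries can be updated and looped over. The following sketch shows a few common operations on the same `sg_info` dictionary. The added `'region'` key and the looked-up `'capital'` key are assumptions made only for this illustration.

```python
sg_info = {
    'country': 'Singapore',
    'year': 2025,
    'population': 5870750}

# Assigning to a new key adds a pair; assigning to an existing key overwrites it
sg_info['region'] = 'Southeast Asia'

# The `in` operator checks whether a key exists
has_population = 'population' in sg_info
print(has_population)  # True

# get() returns None for a missing key, whereas sg_info['capital'] would raise a KeyError
capital = sg_info.get('capital')
print(capital)  # None

# items() yields each key together with its value
for key, value in sg_info.items():
    print(f"{key}: {value}")
```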

#### <a id='toc1_3_3_1_'></a>[Create Dictionary from Lists](#toc0_)

In [None]:
population = [434000, 16250000, 273523621, 7123205, 32365999, 54339766, 108116615, 5870750, 69950807, 104256076]

`zip()` is used to **combine multiple iterables** (such as lists, tuples, etc.) into **a single iterable of tuples**. 

Each tuple contains elements from the input iterables that are at the same position.

*Note*: For more on iterables and iterators in Python, you can read [analyticsvidhya: Iterables and Iterators](https://www.analyticsvidhya.com/blog/2021/07/everything-you-should-know-about-iterables-and-iterators-in-python-as-a-data-scientist/).

In [None]:
zipped = zip(asean_country_list, population)
print(zipped)

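Printing a zip object directly only shows its object representation, not its contents. This is because `zip()` is lazy: it produces the tuples one at a time on demand instead of building them all in memory.

To see the contents, convert it to a collection such as a `list` and print that instead.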

In [None]:
print(list(zipped))

A zip object is an iterator and does not keep its items in memory.

Once you iterate over it, the items are consumed and cannot be iterated again unless you create a new zip object. This is why the next cell prints an empty list:

In [None]:
print(list(zipped))

Other common built-in functions that also produce their items on demand are:

1. `range()`: Represents a sequence of numbers.

2. `enumerate()`: Pairs each item of an iterable with a counter and returns an enumerate object.

3. `map()`: Applies a function to every item of an iterable.

4. `filter()`: Keeps only the items of an iterable for which a function returns a truthy value.
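
The four functions above can be sketched briefly as follows. As with `zip()`, each result is wrapped in `list()` so that its contents can be printed. The list `numbers` and the helper function `is_even` are made up for this illustration.

```python
numbers = [1, 2, 3, 4]

def is_even(n):
    '''Return True if n is divisible by 2.'''
    return n % 2 == 0

first_four = list(range(4))              # numbers from 0 up to, but not including, 4
numbered = list(enumerate(['a', 'b']))   # (counter, item) pairs
as_floats = list(map(float, numbers))    # float() applied to every item
evens = list(filter(is_even, numbers))   # items for which is_even() returns True

print(first_four)  # [0, 1, 2, 3]
print(numbered)    # [(0, 'a'), (1, 'b')]
print(as_floats)   # [1.0, 2.0, 3.0, 4.0]
print(evens)       # [2, 4]
```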

In [None]:
# combine the two lists into a dictionary of country: population key-value pairs
asean_country_info = dict(zip(asean_country_list, population))

In [None]:
print(asean_country_info)

**How to make the printed output more readable?**

*Trick*: Pretty print (`pprint`) displays dictionaries and other data structures in a more readable layout.

In [None]:
from pprint import pprint # pretty print

In [None]:
pprint(asean_country_info)

In [None]:
asean_country_info['Singapore']
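
A dictionary built this way can be combined with the loop and `if` statement from earlier. As a rough sketch, the code below rebuilds the dictionary from the two lists and collects the countries whose population exceeds 100 million. The threshold and the names `large_countries` and `people` are chosen only for illustration.

```python
asean_country_list = ['Brunei Darussalam', 'Cambodia', 'Indonesia', 'Lao PDR',
                      'Malaysia', 'Myanmar', 'Philippines', 'Singapore',
                      'Thailand', 'Vietnam']
population = [434000, 16250000, 273523621, 7123205, 32365999,
              54339766, 108116615, 5870750, 69950807, 104256076]

asean_country_info = dict(zip(asean_country_list, population))

large_countries = []
for country, people in asean_country_info.items():
    if people > 100000000:
        large_countries.append(country)

print(large_countries)  # ['Indonesia', 'Philippines', 'Vietnam']
```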

`Challenge`: Create a dictionary `SUTD` with four keys and the following values:

- number of students : 1000,

- number of faculty : 100,

- location : Changi,

- year established : 2009

Then use an f-string to print the string "SUTD has 1000 students and 100 faculty members. It is located at Changi and was established in 2009."

In [None]:
#——————————————————————————————————————————————————————————————————————————————————————————————#


#——————————————————————————————————————————————————————————————————————————————————————————————#

## <a id='toc1_4_'></a>[Next step](#toc0_)

Go to [1-02: DataFrame and GeoDataFrame](./1-02_dataframe_geo.ipynb)