<a href="https://colab.research.google.com/github/stevenkhwun/P4DS/blob/main/Chp02.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Built-In Data Structures, Functions, and Files

## Data Structures and Sequences

### Tuple

A tuple is a fixed-length, immutable sequence of Python objects which, once assigned, cannot be changed.

In [None]:
# Create a tuple (with parentheses)
tup1 = (4, 5, 6)
tup1

(4, 5, 6)

In [None]:
# Create a tuple (without parentheses)
tup2 = 7, 8, 9
tup2

(7, 8, 9)

#### Converting any sequence or iterator to a tuple by invoking `tuple`

In [None]:
# Convert a list into a tuple
tuple([4, 0, 2])

(4, 0, 2)

In [None]:
# Convert a string into a tuple
tup3 = tuple('string')
tup3

('s', 't', 'r', 'i', 'n', 'g')
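
The heading above mentions iterators as well as sequences, but both examples use sequences. As a small sketch, `tuple` can also consume a `range` object or a generator expression. The names `from_range` and `from_generator` are illustrative only and are not used elsewhere in this notebook.

```python
# tuple() consumes any iterable, including lazy iterators
from_range = tuple(range(4))                     # built from a range object
from_generator = tuple(x * x for x in range(4))  # built from a generator expression

print(from_range)      # (0, 1, 2, 3)
print(from_generator)  # (0, 1, 4, 9)
```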

#### Accessing elements by `[]`

In [None]:
# Accessing elements of tuple
tup3[0]

's'

#### Nested tuples

In [None]:
# Create a nested tuple by enclosing each inner tuple in parentheses
nested_tup = (4, 5, 6), (7, 8)
nested_tup

((4, 5, 6), (7, 8))

In [None]:
# Access the first inner tuple of the nested tuple
nested_tup[0]

(4, 5, 6)

#### Mutable elements in a tuple

While the objects stored in a tuple may be mutable themselves, once the tuple is created it's not possible to modify which object is stored in each slot:

In [None]:
# Creating a tuple with different types of objects
tup4 = ('foo'), [1, 2], (True)
tup4

('foo', [1, 2], True)

In [None]:
# Another way to create the same tuple
tup5 =  ('foo', [1, 2], True)
tup5

('foo', [1, 2], True)

In [None]:
# Checking equivalence of the tuples
tup4 == tup5

True

In [None]:
# The object stored in a tuple slot cannot be reassigned
tup4[2] = False

TypeError: 'tuple' object does not support item assignment

In [None]:
# Modifying a mutable object in place inside a tuple
tup4[1].append(3)
tup4

('foo', [1, 2, 3], True)

#### Concatenating tuples using the `+` operator

In [None]:
# Concatenating tuples
tup6 = (4, None, 'foo') + (6, 0)
tup6

(4, None, 'foo', 6, 0)

In [None]:
# Creating a tuple with a single element
# Note: the trailing comma is required for any one-element tuple
k = ('bar',)
print(k)
type(k)

('bar',)


tuple

In [None]:
# Concatenating tuples
tup6 + k

(4, None, 'foo', 6, 0, 'bar')
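
To see why the trailing comma matters, the following sketch compares the two spellings. Parentheses alone only group an expression, so `('bar')` is a plain string. The same applies to any single element, not only strings. The variable names `not_a_tuple` and `one_tuple` are illustrative.

```python
# Parentheses alone do not create a tuple; the trailing comma does
not_a_tuple = ('bar')    # just the string 'bar'
one_tuple = ('bar',)     # a tuple with one element

print(type(not_a_tuple).__name__)  # str
print(type(one_tuple).__name__)    # tuple

# Concatenating a tuple with a plain string is rejected
try:
    (6, 0) + not_a_tuple
except TypeError as err:
    print("TypeError:", err)
```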

#### Multiplying a tuple by an integer

In [None]:
# Multiplying a tuple
('foo', 'bar') * 4

('foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar')
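
Multiplying a tuple repeats references to its elements and does not copy the objects themselves. This makes no difference for strings, but it is visible when an element is mutable. A minimal sketch, using the illustrative names `inner` and `repeated`:

```python
inner = [1, 2]
repeated = (inner,) * 3            # three references to the same list

inner.append(3)                    # mutate the shared list once
print(repeated)                    # ([1, 2, 3], [1, 2, 3], [1, 2, 3])
print(repeated[0] is repeated[2])  # True
```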

# Basic descriptive statistics

In [6]:
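# Import the necessary libraries
import pandas as pd
import numpy as np
import scipy.stats
import wquantiles  # third-party package; if the import fails, try `!pip install wquantiles`

ModuleNotFoundError: No module named 'wquantiles'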

In [2]:
# Load the dataset
link = "https://raw.githubusercontent.com/stevenkhwun/P4DS/main/Data/state.csv"
state = pd.read_csv(link)
state.head()

Unnamed: 0,State,Population,Murder.Rate,Abbreviation
0,Alabama,4779736,5.7,AL
1,Alaska,710231,5.6,AK
2,Arizona,6392017,4.7,AZ
3,Arkansas,2915918,5.6,AR
4,California,37253956,4.4,CA


In [3]:
state['Population'].mean()

6162876.3

In [5]:
scipy.stats.trim_mean(state['Population'], 0.1)

4783697.125
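
The trimmed mean above discards a fixed proportion of the smallest and largest values before averaging, which makes it less sensitive to extreme values than the plain mean. The following pure-Python function is a simplified sketch of that idea, not SciPy's actual implementation. The function name `trimmed_mean` and the `sample` data are made up for illustration.

```python
def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the sorted values from each end."""
    ordered = sorted(values)
    k = int(proportion * len(ordered))    # number of values dropped per end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1000]   # made-up data with one outlier

print(sum(sample) / len(sample))   # 104.5 (plain mean, pulled up by the outlier)
print(trimmed_mean(sample, 0.1))   # 5.5   (smallest and largest value dropped)
```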

However, only `a` and `b` refer to the same object, the list `[1, 2, 3]`. `c` is a copy of `a` created by the `list()` function, so it is a distinct object.

In [None]:
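# Setup described in the text: b is another name for a, c is a copy
a = [1, 2, 3]
b = a
c = list(a)

# Compare the objects using is and is not
print(a is b)
print(a is c)
print(b is not c)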

True
False
True


By appending an element to `a` and then examining `b`, you can see that `a` and `b` are referring to the same object.

In [None]:
# Appending element to a
a.append(4)
print(a)
print(b)

[1, 2, 3, 4]
[1, 2, 3, 4]
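
For contrast, the copy is unaffected by the append. The sketch below rebuilds the three names from scratch so that it runs on its own. It assumes `c` was created with `list(a)`, as described in the text.

```python
a = [1, 2, 3]
b = a          # second name for the same list
c = list(a)    # new list with the same contents

a.append(4)
print(b)               # [1, 2, 3, 4]
print(c)               # [1, 2, 3]
print(a == c)          # False
print(id(a) == id(b))  # True
```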


# Some important concepts

## Mutable and immutable objects

### Mutable objects

Many objects in Python, such as lists, dictionaries, NumPy arrays, and most user-defined types (classes), are mutable. This means that the object or values that they contain can be modified:

In [None]:
# Mutable object list
a_list = ["foo", 2, [4, 5]]
a_list

['foo', 2, [4, 5]]

In [None]:
# Modify an element of the list
a_list[2] = (3, 4)
a_list

['foo', 2, (3, 4)]

### Immutable objects

Others, like strings and tuples, are immutable, which means their internal data cannot be changed:

In [None]:
# Immutable object tuple
a_tuple = (3, 5, (4, 5))
a_tuple

(3, 5, (4, 5))

When you try to modify an element in the tuple, for example:

```python
a_tuple[1] = "four"
```

Python raises the following error:

```python
TypeError: 'tuple' object does not support item assignment
```

Recall that you access an element of a tuple by its index. The following command returns the second element of the tuple.

In [None]:
# Accessing element in a tuple
a_tuple[1]

5

Similarly, for a string, we can use the following commands:

In [None]:
# Accessing element in a string
string = "this is a string"
string[2]     # This is "i"

'i'

In [None]:
# Accessing element in a string
string[4]      # This is a space

' '
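
Reading a character by index works, but assigning to one does not, because strings are immutable. Methods such as `str.replace` return a new string and leave the original untouched. A short sketch, using the illustrative names `text` and `longer`:

```python
text = "this is a string"

try:
    text[10] = "f"       # strings do not support item assignment
except TypeError as err:
    print("TypeError:", err)

# replace() builds a new string instead of changing the original
longer = text.replace("string", "longer string")
print(longer)   # this is a longer string
print(text)     # this is a string
```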

# Import custom modules

In Python, a module is simply a file with the `.py` extension containing Python code. Suppose we had the following module:

```python
# some_module.py
PI = 3.14159

def f(x):
    return x + 2
    
def g(a, b):
    return a + b
```

To import a module, place the `.py` file in the same folder as your notebook or script, or name the subfolder that contains it in the import statement:

```python
# In the same folder
import module_name   # the .py extension is omitted

# In a subfolder named Module
from Module import module_name
```
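
When the module lives in a folder that Python does not search by default, another option is to add that folder to `sys.path` before importing. The sketch below writes a small stand-in module into a temporary folder so that it runs anywhere. The names `demo_module` and `module_dir` are hypothetical and chosen for illustration.

```python
import importlib
import os
import sys
import tempfile

# Write a tiny stand-in module into a temporary folder
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "demo_module.py"), "w") as fh:
    fh.write("PI = 3.14159\n\ndef f(x):\n    return x + 2\n")

# Tell Python where to look, then import as usual
sys.path.append(module_dir)
importlib.invalidate_caches()   # make sure the new file is noticed
import demo_module

print(demo_module.f(5))   # 7
print(demo_module.PI)     # 3.14159
```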

The following example demonstrates importing a module in the Colab environment, which requires a few extra steps. Colab is a Jupyter Notebook environment that runs in the cloud and cannot access files on your computer's hard drive, so you first need to upload the module file to the cloud and make it accessible to the notebook.



## Colab operations

First, upload the custom module file to your Google Drive. In this example, the `some_module.py` file is uploaded to the `Modules` subdirectory of Google Drive.

To make the module file accessible by Jupyter Notebook, you need to copy the module file to the working directory of the notebook. To check the working directory of the notebook, you can use the following command:


In [None]:
# Check your Colab temporary path
! pwd

/content


Before you can copy the file to the working directory, you need to mount your Google drive to Google Colab.

In [None]:
# Mount your Google drive to Google Colab
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


You can check the file directory by clicking on the left panel of the Colab environment. You can also check the working directory again.

In [None]:
# Check your Colab temporary path
! pwd

/content


Now you can copy the custom module from Google Drive to the Colab temporary drive.

In [None]:
# Copy custom module from Google Drive to Colab temporary drive 
# !cp [path of your custom module] [path where you like to copy]
! cp /content/drive/MyDrive/Modules/some_module.py /content/

## Running the custom module

The module file has now been copied to the Colab temporary drive and you can import the module by the following command:

In [None]:
# Import the module
import some_module

In [None]:
# Run the function in the module
result = some_module.f(5)   # Run f
pi = some_module.PI         # Obtain the constant stated in the module

# Print the results
print(result)
print(pi)

7
3.14159


To use a function or constant from the module without the module prefix, import it by name:

In [None]:
# Directly import function in a module
from some_module import g, PI

# Run the function
results = g(1, 10)
print(results)
print(PI)

11
3.14159


Finally, you can also use the `as` keyword to give imports different variable names:

In [None]:
# Use as keyword
import some_module as sm
from some_module import PI as pi, g as gf

# Run the function
r1 = sm.f(pi)
r2 = gf(6, pi)
print(r1)
print(r2)

5.14159
9.14159
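
One practical point when working this way: Python caches imported modules, so running `import` again after editing and re-copying the module file does not pick up the changes. `importlib.reload` re-executes the module. The sketch below simulates an edit with a temporary stand-in module. The name `reload_demo` is hypothetical.

```python
import importlib
import os
import sys
import tempfile

# Create a stand-in module in a temporary folder
module_dir = tempfile.mkdtemp()
module_path = os.path.join(module_dir, "reload_demo.py")
sys.path.append(module_dir)

with open(module_path, "w") as fh:
    fh.write("PI = 3.14\n")
importlib.invalidate_caches()
import reload_demo
print(reload_demo.PI)          # 3.14

# Simulate editing the file, then reload to pick up the change
with open(module_path, "w") as fh:
    fh.write("PI = 3.14159\n")
importlib.reload(reload_demo)
print(reload_demo.PI)          # 3.14159
```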


This is the end of the document.

In [None]:
# An element appending function
def append_element(some_list, element):
    some_list.append(element)

In [None]:
data = [1, 2, 3]
append_element(data, 100)
data

[1, 2, 3, 100]

In [None]:
b = [1, 2, 3]
b?   # IPython/Colab only: display information about the object

In [None]:
a = "foo goo"
a.count("f")

1

In [None]:
type(a)

str

In [None]:
n = 689
type(n)

int

In [None]:
n.real

689

In [None]:
import pandas as pd

In [None]:
df = pd.read_csv("sample_data/mnist_test.csv")