In [5]:
import numpy as np
import pandas as pd

# [Lists](https://docs.python.org/3/tutorial/introduction.html#lists)

- A list of comma-separated values (items) between square brackets
- Might contain items of different types
- Mutable
- Manipulation: slicing, indexing, concatenation
- List methods:
    - `append()`
    - `extend()`
    - `insert()`
    - `remove()`
    - `pop()`
    - `sort()`
    - `reverse()`
    - `copy()`
    - etc.
- List comprehensions
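
The operations listed above can be sketched in a few lines. This is a minimal illustration, not part of the original notebook; the variable names (`items`, `numbers`, and so on) are made up for the example.

```python
# A list may mix item types and can be changed in place (it is mutable).
items = [3, "a", 1.5]
items.append(7)            # add one item at the end
items.extend([8, 9])       # add every item of another list
items.insert(0, "first")   # insert a value at index 0
items.remove("a")          # remove the first item equal to "a"
last = items.pop()         # remove and return the last item

numbers = [4, 1, 3]
snapshot = numbers.copy()  # shallow copy, unaffected by the sort below
numbers.sort()             # sort in place
numbers.reverse()          # reverse in place

combined = numbers + [0]   # concatenation builds a new list
first_two = combined[:2]   # slicing
third = combined[2]        # indexing

# List comprehension: build a new list from an existing one.
squares = [n ** 2 for n in numbers]

print(items, last, snapshot, combined, first_two, third, squares)
```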

# [Tuples](https://docs.python.org/3/tutorial/datastructures.html#tuples-and-sequences)

- Consist of a number of values separated by commas; the surrounding parentheses are optional on input but **often necessary** (for example, when the tuple is part of a larger expression)
- Very similar to lists, but are **immutable**
- Usually contain a heterogeneous sequence of elements
- Used for sequence unpacking
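
A short sketch of these points, assuming nothing beyond the standard library; the names `point`, `single`, and `error_message` are illustrative only.

```python
# Commas make the tuple; parentheses are optional in a plain assignment.
point = 1, "two", 3.0
single = (5,)              # a one-item tuple needs the trailing comma

# Sequence unpacking: one name per element.
x, label, z = point

# Tuples are immutable, so item assignment raises TypeError.
try:
    point[0] = 10
except TypeError as err:
    error_message = str(err)

print(point, single, x, label, z)
print(error_message)
```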

# [Dictionaries](https://docs.python.org/3/tutorial/datastructures.html#dictionaries)

- A set of key: value pairs, with the requirement that the keys are unique
- Indexed by keys, which can be any immutable type (strings and numbers can always be keys; tuples can be keys if they contain only immutable items)
- Manipulation: 
    - Store a value with some key
    - Extract the value given the key
    - Delete a key
- Methods:
    - `keys()`
    - `values()`
    - `items()`
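
The manipulations and methods above might look like this in practice. The dictionary contents and names (`ages`, `grid`) are invented for illustration.

```python
ages = {"ann": 31, "bob": 25}

ages["cy"] = 40            # store a value under a new key
bob_age = ages["bob"]      # extract the value for a key
del ages["ann"]            # delete a key (and its value)

# A tuple of immutable items is also a valid key.
grid = {(0, 0): "origin"}

# keys(), values() and items() return views; wrap them in list() to copy.
keys = list(ages.keys())
values = list(ages.values())
pairs = list(ages.items())

print(bob_age, keys, values, pairs, grid[(0, 0)])
```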

# [Arrays](https://numpy.org/doc/stable/user/absolute_beginners.html#what-is-an-array)

- A grid of values that also contains information about the raw data, how to locate an element, and how to interpret an element
- The elements are all of the same type, referred to as the array `dtype`
- NumPy arrays are faster and more compact than Python lists, and NumPy provides a mechanism for specifying the data type
- Can be initialized from Python lists, or created pre-filled with zeros, ones, or random values
- A **vector** is an array with a single dimension (there’s no difference between row and column vectors), while a **matrix** refers to an array with two dimensions. For 3-D or higher dimensional arrays, the term **tensor** is also commonly used
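
As a rough sketch of the creation routes mentioned above, the following uses `np.array`, `np.zeros`, `np.ones`, and a seeded random generator. The variable names and the seed are arbitrary choices for the example.

```python
import numpy as np

from_list = np.array([1, 2, 3])                    # dtype inferred from the items
as_float = np.array([1, 2, 3], dtype=np.float64)   # dtype set explicitly

zeros = np.zeros(3)                # vector of three 0.0 values
ones = np.ones((2, 3))             # 2 x 3 matrix of 1.0 values

rng = np.random.default_rng(seed=0)
random_values = rng.random(4)      # four floats drawn from [0, 1)

matrix = np.array([[1, 2], [3, 4]])

# ndim is 1 for a vector and 2 for a matrix.
print(from_list.dtype, as_float.dtype)
print(from_list.ndim, matrix.ndim, ones.shape)
```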

# [Dataframes](https://pandas.pydata.org/docs/getting_started/intro_tutorials/01_table_oriented.html#pandas-data-table-representation)

![Representation](https://pandas.pydata.org/docs/_images/01_table_dataframe.svg)
- A 2-dimensional data structure that can store data of **different types** (including characters, integers, floating point values, categorical data and more) in columns
- Can be initialized from Python dictionaries or 2-D arrays (and Python nested lists)

# [Series](https://pandas.pydata.org/docs/getting_started/intro_tutorials/01_table_oriented.html#each-column-in-a-dataframe-is-a-series)

![Representation](https://pandas.pydata.org/docs/_images/01_table_series.svg)
- Each column in a DataFrame is a Series: selecting a single column of a pandas DataFrame returns a pandas Series
- Can be initialized from Python lists or 1-D arrays
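
To illustrate the relationship between the two types, here is a small sketch with a made-up table; the column names `name` and `age` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"name": ["Ann", "Bob"], "age": [31, 25]})

age_column = df["age"]     # a single column label returns a Series
age_frame = df[["age"]]    # a list of labels returns a DataFrame

print(type(age_column).__name__)
print(type(age_frame).__name__)
```

Selecting with a list of labels, even a one-item list, keeps the two-dimensional DataFrame shape.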

# Conversion between data types

## From list to...

In [16]:
a = [1,2,3,4,5]
print(a)
type(a)

[1, 2, 3, 4, 5]


list

### Tuple

In [17]:
a_tuple = tuple(a)
print(a_tuple)
type(a_tuple)

(1, 2, 3, 4, 5)


tuple

### Array

In [18]:
a_array = np.array(a)
print(a_array)
type(a_array)

[1 2 3 4 5]


numpy.ndarray

### Series (the first column of the output is the index, starting from 0)

In [19]:
a_series = pd.Series(a)
print(a_series)
type(a_series)

0    1
1    2
2    3
3    4
4    5
dtype: int64


pandas.core.series.Series

### DataFrame (for nested lists)

In [15]:
b = [[1, 2], [3, 4]]
b_df = pd.DataFrame(b)
print(b_df)
type(b_df)

   0  1
0  1  2
1  3  4


pandas.core.frame.DataFrame

## From tuple to...

In [20]:
c = (1,2,3,4)
print(c)
type(c)

(1, 2, 3, 4)


tuple

### List

In [21]:
c_list = list(c)
print(c_list)
type(c_list)

[1, 2, 3, 4]


list

### To others: exactly the same as list
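
For completeness, a brief sketch of the same conversions starting from a tuple. It redefines `c` so that it runs on its own, and the names `c_array` and `c_series` are illustrative.

```python
import numpy as np
import pandas as pd

c = (1, 2, 3, 4)

c_array = np.array(c)      # tuple -> 1-D array
c_series = pd.Series(c)    # tuple -> Series with a default 0-based index

print(c_array)
print(c_series)
```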

## From dictionary to...

In [23]:
d = {'a':'b', 'c':'d', 'e':'f', 1:2}
print(d)
type(d)

{'a': 'b', 'c': 'd', 'e': 'f', 1: 2}


dict

### List (keys)

In [24]:
d_keys_list = list(d.keys()) # Or list(d)
print(d_keys_list)
type(d_keys_list)

['a', 'c', 'e', 1]


list

### List (values)

In [25]:
d_values_list = list(d.values())
print(d_values_list)
type(d_values_list)

['b', 'd', 'f', 2]


list

### Tuple (works the same way for keys and values)

In [26]:
d_keys_tuple = tuple(d.keys())
print(d_keys_tuple)
type(d_keys_tuple)

('a', 'c', 'e', 1)


tuple

### DataFrame (each dictionary value should be a list or tuple, all of the same length)

In [29]:
e = {'a':['b','c'], 'd':['e','f'], 'g':['h','i']}
e_df = pd.DataFrame(e)
print(e_df)
type(e_df)

   a  d  g
0  b  e  h
1  c  f  i


pandas.core.frame.DataFrame

### Series (One key-value pair)

In [30]:
e_series = pd.Series(e['a'])
print(e_series)
type(e_series)

0    b
1    c
dtype: object


pandas.core.series.Series

## From array to...

In [32]:
f = np.arange(9).reshape(3,3)
print(f)
type(f)

[[0 1 2]
 [3 4 5]
 [6 7 8]]


numpy.ndarray

### List

In [33]:
f_list = list(f)
print(f_list)
type(f_list)

[array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8])]


list
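
Note that `list(f)` only converts the outer dimension, so the result is a list of 1-D arrays. If plain nested Python lists are wanted instead, the array's `tolist()` method is one option. The sketch below rebuilds `f` so it runs on its own; `f_rows` and `f_nested` are illustrative names.

```python
import numpy as np

f = np.arange(9).reshape(3, 3)

f_rows = list(f)           # list of three 1-D arrays (one per row)
f_nested = f.tolist()      # nested lists of plain Python ints

print(type(f_rows[0]).__name__)
print(f_nested)
```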

### Tuple: similar to list
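
A minimal sketch of the tuple case, rebuilding `f` so the cell is self-contained. As with `list(f)`, `tuple(f)` converts only the outer dimension; the nested variant is one possible way to get tuples all the way down.

```python
import numpy as np

f = np.arange(9).reshape(3, 3)

f_tuple = tuple(f)         # tuple of three 1-D arrays
f_nested_tuple = tuple(tuple(row) for row in f.tolist())  # tuples of ints

print(type(f_tuple).__name__, len(f_tuple))
print(f_nested_tuple)
```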

### DataFrame

In [34]:
f_df = pd.DataFrame(f)
print(f_df)
type(f_df)

   0  1  2
0  0  1  2
1  3  4  5
2  6  7  8


pandas.core.frame.DataFrame