In [1]:
import pandas

data = pandas.read_csv("weather_data.csv")

In [2]:
type(data)

pandas.core.frame.DataFrame

In [3]:
type(data["temp"])

pandas.core.series.Series

pandas provides two main data structures:
 - `DataFrame`: the equivalent of a whole table. The full contents of a CSV file (every row and column) are loaded into a single DataFrame.
 - `Series`: the equivalent of a single column in that table. It behaves much like a Python list, with an index label attached to each value. For example, the `temp` column of the CSV becomes a Series holding these values:
     - `CSV "temp" column -> temp = [12,14,15,14,21,22,24]`
     
The complete documentation on pandas data structures is available [here](https://pandas.pydata.org/docs/user_guide/dsintro.html#dataframe)
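
As a minimal sketch of the relationship between the two structures, the snippet below builds a small DataFrame inline instead of reading `weather_data.csv` (an assumption, so that it runs without the file) and checks that a single column is a Series sharing the table's row labels.

```python
import pandas

# Inline stand-in for weather_data.csv (first three rows only, assumed for illustration).
data = pandas.DataFrame({
    "day": ["Monday", "Tuesday", "Wednesday"],
    "temp": [12, 14, 15],
    "condition": ["Sunny", "Rain", "Rain"],
})

temp = data["temp"]  # selecting one column returns a Series

is_frame = isinstance(data, pandas.DataFrame)
is_series = isinstance(temp, pandas.Series)
shape = data.shape                          # (number of rows, number of columns)
column_names = list(data.columns)
same_index = temp.index.equals(data.index)  # the column keeps the table's row labels

print(is_frame, is_series, shape, column_names, same_index)
```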

- The cell below uses a DataFrame method, `to_dict()`, to convert the table into a dictionary. Each column `header` becomes a `key`, and its `value` is a nested dictionary that maps each row index to the cell content.

<font color = "magenta">var = *assigned_CSV_variable*.***to_dict()***</font>

In [4]:
# named data_dict so the built-in dict type is not shadowed
data_dict = data.to_dict()
data_dict

{'day': {0: 'Monday',
  1: 'Tuesday',
  2: 'Wednesday',
  3: 'Thursday',
  4: 'Friday',
  5: 'Saturday',
  6: 'Sunday'},
 'temp': {0: 12, 1: 14, 2: 15, 3: 14, 4: 21, 5: 22, 6: 24},
 'condition': {0: 'Sunny',
  1: 'Rain',
  2: 'Rain',
  3: 'Cloudy',
  4: 'Sunny',
  5: 'Sunny',
  6: 'Sunny'}}

- The cell below uses a Series method, `to_list()`, to convert a single column into a plain Python list that holds the cell contents directly as values.

<font color = "lime">var = *assigned_CSV_Series*.***to_list()***</font>

In [5]:
lst = data["temp"].to_list()
lst

[12, 14, 15, 14, 21, 22, 24]
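
`to_dict()` also accepts an `orient` argument that changes the layout of the result. The sketch below compares the default layout with `"list"` and `"records"` on a two-row table built inline (the shortened data is an assumption for illustration).

```python
import pandas

# Two-row stand-in for the weather table (assumed for illustration).
data = pandas.DataFrame({
    "day": ["Monday", "Tuesday"],
    "temp": [12, 14],
})

by_column = data.to_dict()                   # default: {column: {row index: value}}
as_lists = data.to_dict(orient="list")       # {column: [values]}
as_records = data.to_dict(orient="records")  # one dictionary per row

print(by_column)
print(as_lists)
print(as_records)
```

The `"records"` layout is often the more convenient one when each row has to be handled as its own item.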

# Working with CSV data

## Case 1: Getting average CSV value

### 1. Average CSV column with total data / data count

The average of a CSV column can be calculated manually with these steps:
1. Read the CSV file into a variable using pandas
2. Convert the column (a pandas Series) into a Python list using the `.to_list()` method
3. Calculate the average from the list
    - total of the values / number of values
    - `sum(list_var) / len(list_var)`

In [6]:
sum(lst) / len(lst)

17.428571428571427

### 2. Average CSV column using the pandas `Series.mean()` method

In [7]:
data["temp"].mean()

17.428571428571427

The steps for this method are:
1. Read the CSV file into a variable using pandas
2. Select the CSV column to calculate
3. Calculate the average using the `.mean()` method

Source: [pandas.Series.mean](https://pandas.pydata.org/docs/reference/api/pandas.Series.mean.html#pandas-series-mean)
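
The two approaches give the same result on complete data, but they can differ when a value is missing. The sketch below uses a hypothetical `temp` column with one missing reading (not part of `weather_data.csv`) to show that `sum() / len()` propagates `NaN`, while `Series.mean()` skips it by default.

```python
import math
import pandas

# Hypothetical column with one missing reading (NaN), assumed for illustration.
temps = pandas.Series([12.0, 14.0, float("nan"), 21.0], name="temp")

values = temps.to_list()
manual = sum(values) / len(values)   # NaN spreads through sum(), so the result is NaN
skipped = temps.mean()               # skipna=True by default: averages 12, 14 and 21
strict = temps.mean(skipna=False)    # NaN, because the missing value is included

print(manual, skipped, strict)
```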

## Case 2: Getting the maximum value from the CSV

Explanation:
1. Read the CSV file into a variable using pandas
2. Select the CSV column to calculate
3. Get the maximum value using the `.max()` method

Source: [pandas.Series.max](https://pandas.pydata.org/docs/reference/api/pandas.Series.max.html)

In [8]:
data["temp"].max()

24
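
`.max()` returns only the value. As a small sketch, `Series.idxmax()` returns the index label of the first maximum, which can then be used with `.loc` to find the matching day. The table is rebuilt inline from the values shown earlier so the snippet runs without the CSV file.

```python
import pandas

# Inline copy of the weather table shown above (stand-in for weather_data.csv).
data = pandas.DataFrame({
    "day": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"],
    "temp": [12, 14, 15, 14, 21, 22, 24],
    "condition": ["Sunny", "Rain", "Rain", "Cloudy", "Sunny", "Sunny", "Sunny"],
})

hottest = data["temp"].max()
coldest = data["temp"].min()
hottest_label = data["temp"].idxmax()          # index label of the first maximum
hottest_day = data.loc[hottest_label, "day"]   # look up the day on that row

print(hottest, coldest, hottest_label, hottest_day)
```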

## Case 3: Accessing CSV columns using `[]` and the OOP (attribute) method

In [9]:
data["condition"]

0     Sunny
1      Rain
2      Rain
3    Cloudy
4     Sunny
5     Sunny
6     Sunny
Name: condition, dtype: object

The previous cell uses `[]` to select a Series from the CSV data, with the following approach:
1. Read the CSV data into a variable using the pandas module `data = pandas.read_csv("weather_data.csv")`
2. Select the CSV column as a Series using `[]` on the assigned variable `data["condition"]`

A Series can also be accessed in another way, as in the sample below:

In [10]:
data.condition

0     Sunny
1      Rain
2      Rain
3    Cloudy
4     Sunny
5     Sunny
6     Sunny
Name: condition, dtype: object

Explanation of the previous cell:
1. Read the CSV data into a variable using the pandas module `data = pandas.read_csv("weather_data.csv")`
2. Select the CSV column as an attribute of the DataFrame object

<font color = "lime">dataframe`.`CSV_header/title_names</font><br>
`data.condition`

Both methods share one weakness:
- The name must match the CSV column header exactly, including letter case. If it does not, attribute access raises an `AttributeError` like the one below (and `data["Condition"]` raises a `KeyError`):

In [11]:
data.Condition

AttributeError: 'DataFrame' object has no attribute 'Condition'
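
The sketch below reproduces both failures on a small inline table and shows one possible way around them. `find_column` is a hypothetical helper written for this example, not a pandas function.

```python
import pandas

# Small stand-in table (assumed for illustration).
data = pandas.DataFrame({
    "day": ["Monday", "Tuesday"],
    "condition": ["Sunny", "Rain"],
})

def find_column(frame, name):
    """Hypothetical helper: return the real column name matching `name`, ignoring case."""
    for column in frame.columns:
        if str(column).lower() == name.lower():
            return column
    raise KeyError(name)

try:
    data["Condition"]            # wrong case with [] ...
    bracket_error = None
except KeyError:
    bracket_error = "KeyError"   # ... raises KeyError

try:
    data.Condition               # wrong case with attribute access ...
    attribute_error = None
except AttributeError:
    attribute_error = "AttributeError"   # ... raises AttributeError

has_exact_name = "Condition" in data.columns   # check before accessing
real_name = find_column(data, "Condition")     # resolves to "condition"
conditions = data[real_name].to_list()

print(bracket_error, attribute_error, has_exact_name, real_name, conditions)
```

Checking `data.columns` first is usually enough. The helper is only worth it when the header case is not known in advance.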

## Case 4: Accessing CSV specific row / column

### 1. Accessing all values in the single row

Explanation:
- Read the CSV data into a variable using the pandas module `data = pandas.read_csv("weather_data.csv")`
- Get the matching row with this pattern `dataframe_name[dataframe_name.header_name == "value_to_match"]`

In [12]:
data[data.day == "Friday"]

      day  temp condition
4  Friday    21     Sunny


### 2. Accessing all values from specific conditions
- Scenario: accessing all data with the max temperature

In [13]:
data[data.temp == data.temp.max()]

      day  temp condition
6  Sunday    24     Sunny


Explanation:
- Read the CSV data into a variable using the pandas module `data = pandas.read_csv("weather_data.csv")`
- Get the row details with this pattern <br>
<font color = "magenta">***dataframe[dataframe.header == dataframe.header.`max()`]***</font>

- Scenario: accessing all data with the *Sunny* weather

In [14]:
data[data.condition == "Sunny"]

        day  temp condition
0    Monday    12     Sunny
4    Friday    21     Sunny
5  Saturday    22     Sunny
6    Sunday    24     Sunny


Explanation:
- Read the CSV data into a variable using the pandas module `data = pandas.read_csv("weather_data.csv")`
- Get the matching rows with this pattern `dataframe[dataframe.Series == "specific value"]`
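
The expression inside the brackets is itself a Series of `True` / `False` values, one per row, and the DataFrame keeps only the rows marked `True`. The sketch below shows that mask and, as an extra scenario not in the original notes, combines two conditions with `&` (each condition needs its own parentheses).

```python
import pandas

# Inline copy of the weather table shown above (stand-in for weather_data.csv).
data = pandas.DataFrame({
    "day": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"],
    "temp": [12, 14, 15, 14, 21, 22, 24],
    "condition": ["Sunny", "Rain", "Rain", "Cloudy", "Sunny", "Sunny", "Sunny"],
})

mask = data.condition == "Sunny"   # boolean Series, one value per row
sunny = data[mask]                 # same result as data[data.condition == "Sunny"]

# Two conditions at once: sunny AND warmer than 20 degrees.
sunny_and_warm = data[(data.condition == "Sunny") & (data.temp > 20)]

print(mask.to_list())
print(sunny_and_warm.day.to_list())
```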

### 3. Accessing specific value in the single row

Explanation:
- Read the CSV data into a variable using the pandas module `data = pandas.read_csv("weather_data.csv")`
- Assign a new variable that holds the single row, then select a column from it <br>
`var1 = dataframe[dataframe.Series == "specific_value"]` <br>
`var1.CSV_header_conditions`

In [15]:
friday = data[data.day == "Friday"]
friday.condition

4    Sunny
Name: condition, dtype: object
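
Note that the result above is still a one-element Series, not a plain string. As a sketch of one alternative, `.loc` can select the row and the column in a single step, and `.iloc[0]` (or `.item()`) then extracts the value itself. The table is rebuilt inline so the snippet runs without the CSV file.

```python
import pandas

# Inline copy of the weather table shown above (stand-in for weather_data.csv).
data = pandas.DataFrame({
    "day": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"],
    "temp": [12, 14, 15, 14, 21, 22, 24],
    "condition": ["Sunny", "Rain", "Rain", "Cloudy", "Sunny", "Sunny", "Sunny"],
})

# Row filter and column name in one .loc call: still a Series with one element.
friday_condition = data.loc[data.day == "Friday", "condition"]

first_value = friday_condition.iloc[0]   # first element by position
only_value = friday_condition.item()     # raises ValueError unless exactly one element

print(len(friday_condition), first_value, only_value)
```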

#### 3.2. Adjusting value based on CSV parameter
- Scenario: Converting Friday temp into Fahrenheit

In [16]:
# friday.temp is a one-element Series; .iloc[0] extracts the value itself
fr_temp = int(friday.temp.iloc[0])
fr_fahrenheit = (fr_temp * 9/5) + 32
fr_fahrenheit

69.8
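
The same formula can be applied to the whole column at once, because arithmetic on a Series works element by element. The sketch below is an extension of the Friday scenario, and the column name `temp_f` is an assumption chosen for illustration.

```python
import pandas

# Inline copy of the weather table shown above (stand-in for weather_data.csv).
data = pandas.DataFrame({
    "day": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"],
    "temp": [12, 14, 15, 14, 21, 22, 24],
})

fahrenheit = data["temp"] * 9 / 5 + 32        # applied to every row
with_f = data.assign(temp_f=fahrenheit)       # returns a new DataFrame, data is unchanged

friday_f = with_f.loc[with_f.day == "Friday", "temp_f"].iloc[0]
rounded = [round(value, 1) for value in with_f["temp_f"]]

print(friday_f)
print(rounded)
```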

# Create dataframe from scratch

In [17]:
hero_list = {
    "class": ["Warrior", "Archer", "Sorceress", "Cleric", "Kali"],
    "weapon": ["sword", "bow", "staff", "mace", "fan"],
    "score": [87, 75, 89, 64, 47]
}

    Creating Table based on given values

Explanation:
1. Define the dictionary contents
2. Pass the dictionary to `pandas.DataFrame()` and assign the result to a variable <br>
`var = pandas.DataFrame(dictionary_name)`
3. Display the variable

In [18]:
heroes = pandas.DataFrame(hero_list)
heroes

       class weapon  score
0    Warrior  sword     87
1     Archer    bow     75
2  Sorceress  staff     89
3     Cleric   mace     64
4       Kali    fan     47


In [19]:
print(heroes)

       class weapon  score
0    Warrior  sword     87
1     Archer    bow     75
2  Sorceress  staff     89
3     Cleric   mace     64
4       Kali    fan     47
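
A dictionary of lists describes the table column by column. As a sketch of an alternative, `pandas.DataFrame()` also accepts a list of dictionaries, one per row. Only the first two heroes are used here to keep the example short. Because `class` is a reserved word in Python, that column can only be selected with `[]`, not with attribute access.

```python
import pandas

# One dictionary per row instead of one list per column.
hero_rows = [
    {"class": "Warrior", "weapon": "sword", "score": 87},
    {"class": "Archer", "weapon": "bow", "score": 75},
]

heroes = pandas.DataFrame(hero_rows)

column_names = list(heroes.columns)
classes = heroes["class"].to_list()   # heroes.class would be a SyntaxError
top_score = heroes.score.max()        # attribute access still works for "score"

print(heroes)
print(column_names, classes, top_score)
```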


    Creating csv files based on given values

Explanation:
1. Define the dictionary contents and create the DataFrame from it
2. Call the `.to_csv()` method on the DataFrame to create the CSV file <br>
`dataframe_name.to_csv("file_name_including_location")`
3. Define the CSV file name & the location<br>
if the location is not given, the file is created in the current working directory

In [None]:
heroes.to_csv("heroes_list.csv")

In [None]:
heroes.to_csv("C:\\Users\\M. Haris B. Satriyo\\Desktop\\heroes_list.csv")
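
By default `.to_csv()` also writes the row index as the first column, which comes back as a column named `Unnamed: 0` when the file is read again. The sketch below compares the default with `index=False`. It writes to a temporary folder (an assumption, to avoid leaving files behind) and the file names are hypothetical.

```python
import os
import tempfile

import pandas

heroes = pandas.DataFrame({
    "class": ["Warrior", "Archer"],
    "weapon": ["sword", "bow"],
    "score": [87, 75],
})

with tempfile.TemporaryDirectory() as folder:
    with_index_path = os.path.join(folder, "heroes_with_index.csv")
    without_index_path = os.path.join(folder, "heroes_without_index.csv")

    heroes.to_csv(with_index_path)                   # default: index is written too
    heroes.to_csv(without_index_path, index=False)   # only the three data columns

    # Read both files back while the temporary folder still exists.
    columns_with = list(pandas.read_csv(with_index_path).columns)
    columns_without = list(pandas.read_csv(without_index_path).columns)

print(columns_with)
print(columns_without)
```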