**My First Python Demo Notebook – A Simple Tutorial**
<details>

<summary>Table of contents</summary>

# Notebook content
- I : Visual Studio Code
- II : Jupyter Notebook
- III : Python
- IV : Variable types
- V : NumPy and Pandas
- VI : Data from csv
- VII : Manipulation of dataframes
- VIII : Plot of dataframes
- IX : Pandapower

</details>

# VS Code



Enable autosave: File -> Auto Save

Launch it from the terminal (`code .`), which is handy for opening a project folder.

Some features:
- Dark mode or other themes
- User-friendly Git commands
  - Git commit message [nomenclature](https://www.conventionalcommits.org/en/v1.0.0/)
- GitHub Copilot
- Extensions

# Jupyter Notebook

All the [basics](00_welcome_to_jupyter.ipynb) about Jupyter Notebook.
- What is it?
- Step-by-step execution
- How memory works
- Variable explorer
- Export


Here is the [Basic writing and formatting syntax](https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax) for markdown.

A few useful shortcuts:
- `y` and `m` to switch a cell to Python code or to markdown.
- `Ctrl+Enter` to run the cell
- `Shift+Enter` to run the cell and advance to the next one

And text highlighting:
 - ``---`` - Line through the page
 - `>` - Block quote (text box)
 - \`Text\` - Quoting code like ``this``
 - And triple backticks for multiple lines \`\`\` ``code`` \`\`\`

# Some basics to get ready


First of all, we are not computer scientists, but code hygiene still matters: comments, good variable names, etc.

But if that works... ;)

Here is a complete guide to clean coding tips: [hygiene_tips](https://testdriven.io/blog/clean-code-python/). I recommend reading the _PEP 8_ part.

A problem usually has many solutions, and the same is true in Python, so we will give you one solution (sometimes more), which may or may not be the best one.

So be free as a bird and gain your own experience. :bird:

### Python



Python is an object-oriented programming language, which means that almost everything is an object with its own properties and methods.

Objects are created from a template (like a blueprint) called a class.

Each object can contain functions; these are called methods.
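
As a minimal sketch of these ideas (the `Resistor` class and its attribute names are hypothetical, invented only for illustration): a class defines the blueprint, an object is created from it, and a method is a function attached to the object.

```python
class Resistor:
    """Blueprint (class) describing a resistor. Hypothetical example."""

    def __init__(self, resistance_ohm):
        # Property (attribute) stored on each object
        self.resistance_ohm = resistance_ohm

    def voltage(self, current_a):
        # Method: a function that belongs to the object (Ohm's law, U = R * I)
        return self.resistance_ohm * current_a


# Create an object (instance) from the class
r1 = Resistor(resistance_ohm=10)
print(type(r1).__name__)  # name of the class of r1
print(r1.resistance_ohm)  # access a property
print(r1.voltage(current_a=2))  # call a method
```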

## Help once the package has been imported


Python works with packages: collections of already developed code that we can reuse, since we don't want to (and can't) rewrite everything every time.

Packages exist to do lots of things: statistical analysis, data visualization, etc.

In this tutorial we'll import the following packages:
- *numpy* and *pandas* for data analysis and manipulation
- *plotly* for data visualization
- (*os* is used to change working directories, but you don't need it)

In [1]:
import os
import numpy as np
import pandas as pd
import plotly.express as px
from plotly.subplots import make_subplots

To get information on a function, type: _?function_name_

In [6]:
?print

Signature: print(*args, sep=' ', end='\n', file=None, flush=False)
Docstring:
Prints the values to a stream, or to sys.stdout by default.

sep
  string inserted between values, default a space.
end
  string appended after the last value, default a newline.
file
  a file-like object (stream); defaults to the current sys.stdout.
flush
  whether to forcibly flush the stream.
Type:      builtin_function_or_method

## Basic Data Structures and Sequences : Variable types

Information is stored in variables and can take multiple forms (numeric, string, etc.). The following table summarizes the data types and how to construct them.

|                  | Data Type  | Example                                              |
|------------------|------------|------------------------------------------------------|
| _Text Type_      | str        | `"Hello World"`                                      |
| _Numeric Types_  | int        | 50                                                   |
|                  | float      | 50.5                                                 |
|                  | complex    | 10j                                                  |
| _Sequence Types_ | list       | `["resistance", "admittance", "impedance"]`          |
|                  | tuple      | `("resistance", "admittance","impedance")`           |
|                  | range      | `range(8)`                                           |
| _Mapping Type_   | dict       | `{"Element": "Generator", "Power" : 100}`            |
| _Boolean Type_   | bool       | True, False                                          |

This [link](https://wesmckinney.com/book/python-builtin#tut_data_structures) gives more information about the data structures.

<details>
<summary>Details on variable type</summary>

- **Integers** are whole numbers, negative and positive.
- **Floats** are real numbers, i.e. numbers with a decimal point.
- **Strings** are sequences of characters.
- **Booleans** are True or False (1 or 0).
- **Tuples** are ordered, immutable collections of values.
- **Dictionaries** are collections of data in key: value pairs (they keep insertion order since Python 3.7).
- **Lists** are ordered, mutable collections of values.


</details>


Python infers the variable type automatically when the variable is created. It is also possible to specify it manually (e.g. to change the size).

In [10]:
# Tuple example: parentheses create a tuple, square brackets would create a list
list_test = (9, 1, 2, 3, 4, "z", "x", "y", "x")
print("Class of list_test:\n", type(list_test))
print("list_test contains:\n", list_test)

Class of list_test:
 <class 'tuple'>
list_test contains:
 (9, 1, 2, 3, 4, 'z', 'x', 'y', 'x')


In a sequence, the elements can be accessed with `[]`.

Like C and most common programming languages, Python indexing begins at 0, unlike Matlab, which starts at 1.

In [11]:
print("Element at index 1 in list_test =", list_test[1])

Element at index 1 in list_test = 1


Lists and tuples are similar, but a tuple is immutable.

In [12]:
list_exemple = ["foo", 1, [1, 2, 4], True]

In [14]:
print("Type of the first element:", type(list_exemple[0]))

Type of the first element: <class 'str'>


In [121]:
# list_test[2] = False  # would raise a TypeError: list_test is a tuple, so it is immutable
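
The difference between a mutable list and an immutable tuple can be sketched as follows (the values are hypothetical, chosen only for this illustration):

```python
# Hypothetical example values, not taken from the notebook data
my_list = ["foo", 1, True]
my_tuple = ("foo", 1, True)

# A list is mutable: an element can be replaced in place
my_list[2] = False

# A tuple is immutable: the same assignment raises a TypeError
try:
    my_tuple[2] = False
    tuple_was_modified = True
except TypeError as error:
    tuple_was_modified = False
    print("Tuple assignment failed:", error)

print(my_list)
print(my_tuple)
```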

To access a value inside a nested element (a sub-element), the procedure is the same:

In [17]:
list_exemple[2][0]

1

### Complex number
A complex value can be expressed in polar coordinates or rectangular coordinates :

![Impedance](plot/impedance.png)

In [25]:
# Rectangular
R_real = 10
X_imag = 5
Z = R_real + 1j * X_imag
# Polar
Tension_Module = 10
Phase_Angle = 45  # np.exp interprets this as 45 radians; use np.deg2rad(45) for 45 degrees
U = Tension_Module * np.exp(1j * Phase_Angle)

But Python expresses complex numbers in rectangular coordinates:

In [28]:
print(U)

(5.253219888177298+8.509035245341185j)
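
Note that `np.exp` expects the angle in radians, so an angle given in degrees has to be converted first. A minimal sketch, assuming a magnitude of 10 and a phase of 45 degrees (illustrative values; `u_polar` and the other names are chosen for this example only), using the standard `cmath` module to convert back to polar form:

```python
import cmath
import math

import numpy as np

magnitude = 10
phase_deg = 45

# Polar -> rectangular: convert degrees to radians before using the exponential
u_polar = magnitude * np.exp(1j * np.deg2rad(phase_deg))
print(u_polar)

# Rectangular -> polar: cmath.polar returns (magnitude, phase in radians)
recovered_magnitude, recovered_phase_rad = cmath.polar(u_polar)
print(recovered_magnitude, math.degrees(recovered_phase_rad))
```

With a 45 degree phase, the real and imaginary parts are equal, which is a quick sanity check on the conversion.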


## Manipulating NumPy array and Pandas dataframe

To manipulate data, we can use two packages. They are similar but not exactly the same; here is a short explanation of the differences: [pd/np](https://www.geeksforgeeks.org/difference-between-pandas-vs-numpy/#:~:text=Difference%20between%20Pandas%20and%20Numpy)

Pandas is built on NumPy and is useful for manipulating data structures (tabular data): numeric data (or not) and time series.

NumPy is the fundamental library for numerical computing in Python. It provides high-performance multidimensional arrays and tools to work with them, useful for matrix and vector calculations.

### Numpy
An _array_ is like a _list_ but faster; it comes from the NumPy library (it is stored contiguously in memory, unlike a _list_).

Use a _list_, a _tuple_ or any _array-like_ object to construct a NumPy array.

In [None]:
# Create an ndarray from nested lists, with several dimensions: shape (3, 2, 3)
array_list = np.array(
    [
        [[1, 2, 3], [4, 5, 6]],
        [[7, 8, 9], [10, 11, 12]],
        [[13, 14, 15], [16, 17, 18]],
    ]
)
print(array_list.ndim)  # number of dimensions
print(array_list)  # content of the array

In [None]:
# Access element in array
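
One possible way to access elements, sketched on the same 3 x 2 x 3 values as above (the name `array_3d` is chosen for this example, and the array is rebuilt with `np.arange` so the block is self-contained):

```python
import numpy as np

# Same values as the nested-list array above, rebuilt so the example stands alone
array_3d = np.arange(1, 19).reshape(3, 2, 3)

print(array_3d.shape)  # size along each dimension
print(array_3d[0, 1, 2])  # block 0, row 1, column 2
print(array_3d[2])  # whole last block (a 2 x 3 array)
print(array_3d[:, 0, :])  # first row of every block
```

One index per dimension, separated by commas, selects a single element; `:` keeps a whole dimension.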

In [None]:
# Time interval [1900 to 2020] with step 10 with numpy arange
years = np.arange(1900, 2021, 10)
print(years)

# Population array data for Switzerland, one value per decade
pop = np.array(
    [
        3297641, 3708897, 3870407, 4042131, 4233223, 4692642, 5327776,
        6180925, 6319848, 6713167, 7184003, 7825751, 8640582,
    ]
)

Now I would like to visualize how the population evolves over time:

In [None]:
# Create a DataFrame
df = pd.DataFrame({"Year": years, "Population": pop})

# Create a line plot
fig = px.line(
    df,
    x="Year",
    y="Population",
    title="Population Evolution of Switzerland",
    labels={"Population": "Population", "Year": "Year"},
)

# Format the figure
fig.update_layout(
    margin=dict(l=60, r=60, t=60, b=60),
    coloraxis_showscale=False,  # Remove the color bar
    width=800,  # Width of the figure in pixels
    height=500,  # Height of the figure in pixels
)

# Show the plot
fig.show()

### Dataframe
A dataframe (df) can be built from a dictionary: each key becomes a column and the list of values linked to it fills the rows. The size of the df is defined by the number of values (rows) and keys (columns).

When a df is created, an index is created automatically if we don't specify one; it contains the integers from 0 to n-1 (n = number of values).

In [131]:
# Dictionary values: a scalar (key 3) is repeated on every row
exemple_of_dictionnary = {
    "key 1": [1, 2, 3, 4, 5],
    "key 2": [1, 2, 3, 4, 5],
    "key 3": 3,
    "key 4": ["a", "b", "c", "D", "E"],
}
pd_exemple = pd.DataFrame(exemple_of_dictionnary)
# Add a column year : 2010, 2011, 2012, 2013, 2014
pd_exemple["year"] = [2010, 2011, 2012, 2013, 2014]

# Use the year column as index with set_index
pd_exemple_2 = pd_exemple.set_index("year")

In [None]:
print("Base index :\n", pd_exemple)
print("Custom index :\n", pd_exemple_2)

> **Note** In a df, each column is a series. A series is a one-dimensional array containing data of any type.

In [None]:
# Type of one column: a pandas Series
type(pd_exemple["key 1"])

#### Access to values 

Access the row value(s) with `loc`:

In [None]:
print("Index access to row 0:\n", pd_exemple.loc[0])
print("Index access to row 2010:\n", pd_exemple_2.loc[2010])
print("Index access to multiple row 2010 to 2012 :\n", pd_exemple_2.loc[2010:2012])

Access to the column value(s) :

In [None]:
print("Access to column key 1:\n", pd_exemple["key 1"])
print("Access to column key 1:\n", pd_exemple_2["key 1"])
print("Access to column year with other method:\n", pd_exemple.year)
print("Access to multiple columns with list:\n", pd_exemple[["key 1", "key 2"]])

<details>
<summary>Name column problem</summary>

A column can also be accessed as an attribute, using its name (`df.column_name`). When the name contains a space, this second method cannot work; this [link](https://saturncloud.io/blog/how-to-access-pandas-columns-with-spaces-in-column-names/) explains the problem.

</details>
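
A minimal sketch of the two ways to access a column (`df_demo` and its values are hypothetical, used only for illustration):

```python
import pandas as pd

# Hypothetical dataframe: one column name contains a space, the other does not
df_demo = pd.DataFrame({"key 1": [1, 2, 3], "year": [2010, 2011, 2012]})

# Bracket access works for any column name
print(df_demo["key 1"])

# Attribute access only works when the name is a valid Python identifier
print(df_demo.year)

# df_demo.key 1 would be a SyntaxError, so brackets are required for "key 1"
```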

## Use external data files like csv

Now I would like to use the same values but from a csv file, keeping only the values for Switzerland, not for all the countries. We also load the electricity values.

In [None]:
# Move to the correct working folder
# os.chdir("..")
os.chdir(os.getcwd().replace("/tutorials", ""))
print("We are here", os.getcwd())
os.chdir("tutorials")
print("We are here", os.getcwd())

In [76]:
# Load data from Our World in Data
# with the read_csv function from pandas; the values are separated with ","
df_population = pd.read_csv("data/population-and-demography.csv", sep=",")
df_electricity_generation = pd.read_csv("data/electricity-generation.csv", sep=",")

# Take only Swiss data = row selection
## Switzerland in Entity column in df_population
df_swiss_population = df_population[df_population["Entity"] == "Switzerland"]
## Switzerland in Entity column in df_electricity_generation
df_swiss_electricity_generation = df_electricity_generation[
    df_electricity_generation["Entity"] == "Switzerland"
]

The first thing to do after creating a dataframe from a csv file or another source is to check the data, here with the `info` method of `df_swiss_population`:

Afterwards we can look at the df `df_swiss_population`.

Change the name of a column with the `rename` method (use a `dict`):

In [None]:
# Give a friendly name ("Population") to the population column of df_swiss_population
# (rename takes a dict: {"old column name": "new column name"})
df_swiss_population =
df_swiss_population
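
A minimal sketch of `rename` on a small hypothetical dataframe. The column name "Population (long original name)" is a placeholder, not the real column name of the csv file:

```python
import pandas as pd

# Hypothetical dataframe; "Population (long original name)" is a placeholder,
# not the real column name found in the csv file
df_demo = pd.DataFrame(
    {"Year": [2000, 2010], "Population (long original name)": [7184003, 7825751]}
)

# rename returns a new dataframe; the dict maps old names to new names
df_demo = df_demo.rename(columns={"Population (long original name)": "Population"})
print(df_demo.columns.tolist())
```

Because `rename` returns a new dataframe by default, the result is assigned back to the variable.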

## Manipulate dataframe

Then I want to merge the two dataframes:


> **_Note_** : This manipulation resets the index, but it is also possible to do so manually (e.g. after `.loc`) by using: ``df_data = df_data.reset_index()``

In [None]:
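# Join the data from the two tables, keeping only the values present in both!
## With the pandas function (on=None: the columns common to both dataframes are the keys)
df_swiss_data_inner_join = pd.merge(
    df_swiss_population, df_swiss_electricity_generation, how="inner"
)
## With the method of the df (linked to the variable/object)
df_inner_join_simpler = df_swiss_population.merge(
    df_swiss_electricity_generation, how="inner"
)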

#### Other important part to manipulate

To merge two dfs, we need a key: a column name present in both dfs, spelled exactly the same!
- `on` = Defines on which **column** the dfs are merged (a column with exactly the same name must be present in both); if the parameter `on` is None, the columns common to both dfs are used
- `how` = Defines on which **rows** the dfs are merged (matching keys)
  - `inner` = Default join, only keeps the keys present in both dfs
  - `left` = Keeps all the keys of the **left** df and completes them with the right df's information (when possible; be careful, you will have some `NaN` values)
  - `outer` = Keeps all the keys of both dfs, which also produces `NaN` values
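
The effect of the `how` parameter can be sketched on two small hypothetical dataframes (`df_left`, `df_right` and their values are invented for this illustration):

```python
import pandas as pd

# Hypothetical data: the key column "Year" exists in both dataframes
df_left = pd.DataFrame({"Year": [2000, 2010, 2020], "Population": [1, 2, 3]})
df_right = pd.DataFrame({"Year": [2010, 2020, 2030], "Generation": [10, 20, 30]})

df_inner = pd.merge(df_left, df_right, on="Year", how="inner")
df_left_join = pd.merge(df_left, df_right, on="Year", how="left")
df_outer = pd.merge(df_left, df_right, on="Year", how="outer")

print(df_inner)  # only the years present in both dataframes
print(df_left_join)  # all years of df_left, NaN where df_right has no value
print(df_outer)  # all years of both dataframes, NaN where a value is missing
```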

If the structure of the dfs is **exactly** the same, you can use the `concat` function:

In [None]:
## Stack df_inner_join_simpler and df_swiss_data_inner_join on top of each other
df_concat = pd.concat([df_inner_join_simpler, df_swiss_data_inner_join])
df_concat

To add rows afterwards, older pandas versions offered the `append` method (`df1.append(df2)`); it was removed in pandas 2.0, so use `pd.concat([df1, df2])` instead.

##### NaN (NaT,...) value

This value is a problem when you manipulate data.

Several methods can solve the problem:
- Fill with a constant value - `df=df.fillna(1)` 
  - Replaces `NaN` with the integer 1; for a time series this doesn't work, use `df["time"].fillna(pd.Timestamp("20221225"))` instead
- Remove values - `df.dropna()`
  - Removes the rows (parameter `axis = 0`) or the columns (parameter `axis = 1`) containing `NaN`; by default `axis = 0`
- Interpolate values - `df=df.interpolate()`
  - Fills the gaps by linear interpolation between the neighbouring cells (the midpoint for a single missing value)
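
The three approaches can be compared on a tiny hypothetical dataframe with one missing value (`df_demo` is invented for this illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical column of values with one missing measurement
df_demo = pd.DataFrame({"value": [1.0, np.nan, 3.0]})

print(df_demo.fillna(0))  # replace NaN with a constant
print(df_demo.dropna())  # drop the rows containing NaN
print(df_demo.interpolate())  # linear interpolation between the neighbours
```

None of these calls modifies `df_demo` itself; each returns a new dataframe, which is why the list above assigns the result back (`df=df.fillna(1)`).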

## Visualisation of the data

Now I would like to visualize the evolution of the electricity generation per capita over time (1 TWh = 1e9 kWh):
    "Electricity generation - TWh" / "Population" * 1e9

In [108]:
# The basic way to do that

# First, create the empty column
nb_line = df_swiss_data_inner_join.shape[0]
df_swiss_data_inner_join["Production per capita - kWh"] = np.zeros(nb_line)
# range is needed because an int is not iterable (it creates a sequence of numbers)
for i in range(df_swiss_data_inner_join.shape[0]):
    df_swiss_data_inner_join.loc[i, "Production per capita - kWh"] = (
        df_swiss_data_inner_join.loc[i, "Electricity generation - TWh"]
        / df_swiss_data_inner_join.loc[i, "Population"]
        * 1e9
    )

The second, *__and simpler__*, way to do that:

In [83]:
df_swiss_data_inner_join["Production per capita 2e method - kWh"] = df_swiss_data_inner_join["Electricity generation - TWh"] / df_swiss_data_inner_join["Population"] * 1e9

#### Another way to do that

The power of `apply` combined with a lambda function is its simplicity: we can apply a function (the lambda) to every row in one line of code. (A small comparison of code with and without a [lambda](https://www.geeksforgeeks.org/python-lambda-anonymous-functions-filter-map-reduce/#:~:text=With%20lambda%20function,Without%20lambda%20function) function.)



In [84]:
df_swiss_data_inner_join["Production per capita 3e method - kWh"] = (
    df_swiss_data_inner_join.apply(
        lambda row: row["Electricity generation - TWh"] / row["Population"] * 1e9,
        axis=1,
    )
)

It is also possible to add a condition with an `if` expression.
I would like to know whether the production per capita of each year is above the overall average.

In [109]:
mean_production_per_capita = df_swiss_data_inner_join[
    "Production per capita - kWh"
].mean()
df_swiss_data_inner_join["Per Capita Up than mean"] = df_swiss_data_inner_join.apply(
    lambda row: (
        "Yes"
        if row["Production per capita - kWh"] > mean_production_per_capita
        else "No"
    ),
    axis=1,
)

#### Compare the three methods

If we only need arithmetic on whole columns, the second method is the simplest.
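
To check that the three methods agree, here is a minimal sketch on hypothetical values (only the column names follow the notebook; `df_demo` and the numbers are invented):

```python
import numpy as np
import pandas as pd

# Hypothetical values; only the column names follow the notebook
df_demo = pd.DataFrame(
    {"Electricity generation - TWh": [50.0, 60.0], "Population": [5_000_000, 6_000_000]}
)

# Method 1: explicit loop over the rows
loop_result = np.zeros(df_demo.shape[0])
for i in range(df_demo.shape[0]):
    loop_result[i] = (
        df_demo.loc[i, "Electricity generation - TWh"]
        / df_demo.loc[i, "Population"]
        * 1e9
    )

# Method 2: vectorized operation on whole columns
vectorized_result = df_demo["Electricity generation - TWh"] / df_demo["Population"] * 1e9

# Method 3: apply with a lambda function, row by row
apply_result = df_demo.apply(
    lambda row: row["Electricity generation - TWh"] / row["Population"] * 1e9, axis=1
)

print(np.allclose(loop_result, vectorized_result))
print(np.allclose(loop_result, apply_result))
```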

In [None]:
df_swiss_data_inner_join[
    [
        "Year",
        "Production per capita - kWh",
        "Production per capita 2e method - kWh",
        "Production per capita 3e method - kWh",
    ]
].loc[0:4]

#### Clean dataframe


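As we can see, the dataframe can hold a lot of information; it's possible to remove certain columns, or just print the part you want.
To remove columns, use the ``drop`` method.

In [86]:
# One way to remove unwanted columns (here, the duplicated 2e and 3e method results)
cols_to_drop = ["Production per capita 2e method - kWh", "Production per capita 3e method - kWh"]
df_swiss_data_clean = df_swiss_data_inner_join.drop(columns=cols_to_drop)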

### Plot information

Now I would like to visualize the population and the production per capita over time:

In [None]:
# Create a subplot to add a secondary Y axis
subfig = make_subplots(specs=[[{"secondary_y": True}]])
# Create the line plot
# fig=px.line(df_swiss_data_clean, x="Year",y=["Population","Production per capita - kWh"], title="Production per capita and population evolution in Switzerland", labels={"Population":"Population","Year":"Year"})
fig = px.line(
    df_swiss_data_clean,
    x="Year",
    y="Production per capita - kWh",
    title="Production per capita and population evolution in Switzerland",
    labels={"Population": "Population", "Year": "Year"},
)
fig2 = px.line(
    df_swiss_data_clean,
    x="Year",
    y="Population",
    title="Production per capita and population evolution in Switzerland",
    labels={"Population": "Population", "Year": "Year"},
)
fig2.update_traces(yaxis="y2")
subfig.add_traces(fig.data + fig2.data)
# To format figure
subfig.update_layout(margin=dict(l=60, r=60, t=60, b=60), width=800, height=500)

subfig.layout.xaxis.title = "Time"
subfig.layout.yaxis.title = "Production per capita - kWh"
subfig.layout.yaxis2.title = "Population"
# Show plot
subfig.for_each_trace(
    lambda t: t.update(line=dict(color=t.marker.color))
)  # Change color of each line on the plot
subfig.show()

#### ChatGPT solution

In [None]:
import plotly.express as px
import plotly.graph_objects as go

# Assuming your dataframe is named df_swiss_data_inner_join
fig = px.line(
    df_swiss_data_inner_join,
    x="Year",
    y="Population",
    title="Population and Production per Capita Over Time",
)

# Add the secondary y-axis
fig.add_trace(
    go.Scatter(
        x=df_swiss_data_inner_join["Year"],
        y=df_swiss_data_inner_join["Production per capita - kWh"],
        mode="lines",
        name="Production per capita - kWh",
        yaxis="y2",
    )
)

# Update layout to add a second y-axis
fig.update_layout(
    yaxis2=dict(title="Production per capita - kWh", overlaying="y", side="right"),
    yaxis=dict(title="Population"),
    xaxis=dict(title="Year"),
    legend=dict(x=0.1, y=0.9),
    margin=dict(l=60, r=60, t=60, b=60),
    width=800,
    height=500,
)

# Display the plot
fig.show()

### Dictionary of dataframes
Pandapower builds dictionaries of dataframes; here is an example of that:

In [None]:
df1 = pd.DataFrame({"Name": ["Shahroz", "Samad", "Usama"], "Age": [22, 35, 58]})

df2 = pd.DataFrame(
    {"Class": ["Chemistry", "Physics", "Biology"], "Students": [30, 35, 40]}
)

# list of data frames
dataframes = [df1, df2]

# dictionary to save data frames
frames = {}

for key, value in enumerate(dataframes):
    frames[key] = value  # assigning data frame from list to key in dictionary
    print("key: ", key)
    print(frames[key], "\n")

# access to one data frame by key
print("Accessing the dataframe against key 0 \n", end="")
print(frames[0])

# access to only column of specific data frame through dictionary
print("\nAccessing the first column of dataframe against key 1\n", end="")
print(frames[1]["Class"])

# Pandapower

In [90]:
import pandapower as pp

Tutorial on _Exercice Power flow 1 : Problème 1_

We use the pandapower library: [doc_pandapower](https://pandapower.readthedocs.io/en/v2.9.0/about.html)

![Problem base](plot/Ex_PF1_Pb1_Info_p.png)

Initialize the network by creating an empty network: ExPF1_Pb1

In [91]:
net_ExPF1_Pb1 = pp.create_empty_network(name="ExPF1_Pb1")

Create all the buses

In [92]:
# create buses
b1 = 
b2 = 
b3 = 
b4 = 
b5 = 

Create all the loads

In [93]:
# create load
load_2 = 
load_3 = 

Create all the generators and the slack bus

In [94]:
# create generator (external grid as generator to slack bus)
gen_1 = 
# pp.create_gen(net,bus=b1,p_mw=0,vm_pu=1,min_q_mvar=-100,max_q_mvar=100,name="Gen 1",slack=True)  # Generator on slack bus ??
gen_2 = 
gen_5 = 

Create a custom transformer (pure inductor).

In [95]:
# create transformer
trafo = 

Compute the line parameters in the correct units, converting from per-unit (pu) values.

Define the base values

In [None]:
S_base = 100e6
U_base = 400e3
Z_base = U_base**2 / S_base
Y_base = 1 / Z_base
Ligne_value = {
    "R_pu": [0.02, 0.02, 0.02, 0.04, 0.04],
    "X_pu": [0.2, 0.4, 0.4, 0.4, 0.4],
    "B_pu": [0.8, 0.8, 0.4, 0.4, 0.4],
    "Un": [400, 400, 400, 400, 400],
}
DF_Ligne = pd.DataFrame(Ligne_value, index=["A", "B", "C", "D", "E"])
Z_base

Compute the parameters

In [None]:
rl_ohm_per_km = 
xl_ohm_per_km = 
cl_nF_per_km = 
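
One possible way to fill in these parameters, as a sketch under stated assumptions: each line is taken to be 1 km long and the grid frequency to be 50 Hz, and `B_pu` is read as the total shunt susceptance of the line. None of these assumptions is stated in the text above, so adapt `length_km` and `f_hz` to the exercise data.

```python
import numpy as np
import pandas as pd

# Base values and line data copied from the cell above
S_base = 100e6  # VA
U_base = 400e3  # V
Z_base = U_base**2 / S_base  # ohm
Y_base = 1 / Z_base  # siemens
DF_Ligne = pd.DataFrame(
    {
        "R_pu": [0.02, 0.02, 0.02, 0.04, 0.04],
        "X_pu": [0.2, 0.4, 0.4, 0.4, 0.4],
        "B_pu": [0.8, 0.8, 0.4, 0.4, 0.4],
    },
    index=["A", "B", "C", "D", "E"],
)

# Assumptions (not stated in the exercise text): 1 km lines and a 50 Hz grid
length_km = 1.0
f_hz = 50.0

# pu -> physical units: multiply impedances by Z_base, admittances by Y_base
rl_ohm_per_km = DF_Ligne["R_pu"] * Z_base / length_km
xl_ohm_per_km = DF_Ligne["X_pu"] * Z_base / length_km
# B = 2 * pi * f * C  ->  C = B / (2 * pi * f), then farad -> nanofarad
cl_nF_per_km = DF_Ligne["B_pu"] * Y_base / (2 * np.pi * f_hz) * 1e9 / length_km

print(rl_ohm_per_km)
print(xl_ohm_per_km)
print(cl_nF_per_km)
```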


Create the lines from these parameters

In [98]:
# create line
line_a = 
line_b = 
line_c = 
line_d = 
line_e = 

Check the lines

Check the problem

By default, the power flow is run with Newton-Raphson (`nr`). The available algorithms include:
- `nr` Newton-Raphson (pypower implementation with numba accelerations)
- `gs` Gauss-Seidel (pypower implementation)
- `fdbx` fast-decoupled (pypower implementation)

Check the results