---

## !!! READ THIS BEFORE RUNNING THE CODE !!!

---

#### Prerequisites

1. Install VSCode.
2. Go to Extensions in VSCode and install Jupyter Notebook and Python (newest versions).
3. Now go to the menu header "Terminal" and open up a new Terminal.
4. Run the following commands in the terminal:
  - pip install --upgrade pandas
  - pip install --upgrade numpy
  - pip install --upgrade matplotlib
  - pip install --upgrade openpyxl (needed by pandas to read .xlsx files)

---

#### Files

- "runnable.ipynb" contains all the runnable code.
- "config" contains all config files.
- "sample" is where the sample is stored.
- "plots" is where the plots are saved to after running the code.

---

#### Important

- Always run the codeblocks in order (first to last). Alternatively, you can use "Run All".

- When making changes to the code or plotting, read the code comments carefully before proceeding.

---

#### Description

1. The first codeblock is used solely for data import and is necessary for the rest of the codeblocks.

2. The second codeblock is used to create a config file **placeholder** in .csv format.
To make use of this placeholder file, open "parameter_hier_einstellen.csv" in Excel and
save it in .xlsx format as "eingestellte_parameter.xlsx" in the "config" folder.
Add the columns "notRel", "isXLog", "isYLog", "isYInv" and "notTrend", which the third codeblock reads.
Now you can set the parameters (columns) for each plot (rows) by inserting a "1"
to activate the config setting at the desired place.

3. The third codeblock is used to process the given data and create a scatter plot based on data and config settings.
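
As a rough illustration of how the config flags work, the sketch below builds a small config table in memory instead of reading it from Excel. The two column names "Depth" and "Porosity", the headers "x" and "y", and the helper `is_set` are made up for this example; only the flag names come from the third codeblock.

```python
import pandas as pd

# Hypothetical stand-in for the worksheet columns (not the real sample data)
columns = ["Depth", "Porosity"]

# Every (x, y) column pair, in the same order the first codeblock produces them
pairs = [[x, y] for x in columns for y in columns]

# Flag columns read by the third codeblock; an empty cell means "off"
flags = ["notRel", "isXLog", "isYLog", "isYInv", "notTrend"]

# "x" and "y" are illustrative headers, the real placeholder file uses 0 and 1
config = pd.DataFrame(pairs, columns=["x", "y"])
for flag in flags:
    config[flag] = ""

# Example settings: skip plot 0, use a logarithmic y-axis for plot 1
config.at[0, "notRel"] = 1
config.at[1, "isYLog"] = 1

def is_set(value):
    # Accepts the number 1 or the text "1", like the checks in the third codeblock
    return value == 1 or value == "1"

skipped = [idx for idx in config.index if is_set(config.at[idx, "notRel"])]
log_y = [idx for idx in config.index if is_set(config.at[idx, "isYLog"])]
print(skipped, log_y)
```

Each row of the table belongs to one plot, and the row order must match the order of the column pairs. Cells left empty in Excel are read back as missing values, which also count as "off".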

---


In [None]:
# These imports are necessary to run library functions
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Gets the information from the Sample Excel Worksheet and saves it in "df"
df = pd.read_excel('sample/Sampling KIT-2.xlsx', sheet_name='Kompakt')

# Builds every (x, y) pair of worksheet columns and stores the pairs in the list "data"
data = []
for columnX in df.columns:
  for columnY in df.columns:
    data.append([columnX, columnY])

# Creates the "config" and "plots" directories; nothing happens if they already exist
os.makedirs("config", exist_ok=True)
os.makedirs("plots", exist_ok=True)

In [None]:
# Creates a data frame from the given data
createConfigPlaceholder = pd.DataFrame(data)

# Saves data in .csv format
createConfigPlaceholder.to_csv('config/parameter_hier_einstellen.csv')

In [None]:
# Gets data from config excel sheet "eingestellte_parameter.xlsx"
config = pd.read_excel('config/eingestellte_parameter.xlsx')

# Defines one color per rock type and saves them in the "colors" dictionary
colors = {'M1':'#fff200', 'M2':'#ff9d00', 'M3':'#ff0000', 'MS':'#008300', 'F':'#4274ff', 'FZ':'#000000', 'MH':'#8f8f8f'}

# Gets the rock type of each sample; it is used as the key into "colors"
c = df['Rock type']

# For loop for all plots + an index variable "idx" that increments with each loop for config reading purposes
for idx, plotData in enumerate(data):

  # Gets the configuration of the excel sheet for each plot (is it relevant, should the x-axis/y-axis be logarithmic, should the y-axis be inverted, should a trendline be drawn)
  notRel = config.at[idx, 'notRel']
  isXLog = config.at[idx, 'isXLog']
  isYLog = config.at[idx, 'isYLog']
  isYInv = config.at[idx, 'isYInv']
  notTrend = config.at[idx, 'notTrend']

  # Skips loop iteration if "not relevant" config is set to 1
  if notRel == 1 or notRel == "1":
    continue

  # Sets data for the x-axis and y-axis
  x = list(df[plotData[0]])
  y = list(df[plotData[1]])

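  # Sets plot style before the figure is created so that it applies to the whole figure
  # (matplotlib 3.6 renamed the 'seaborn' style to 'seaborn-v0_8')
  plt.style.use('seaborn-v0_8')

  # Sets plot width and height in inches
  plt.figure(figsize=(10,10))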

  # Sets the label names for each axis
  plt.xlabel(plotData[0])
  plt.ylabel(plotData[1])

  # Creates normal scatter plot
  scatter = plt.scatter(x,y,s=100,edgecolors="black",c=c.map(colors))

  # Sets x-axis scale to logarithmic if "is x-axis logarithmic" config is set to 1
  if isXLog == 1 or isXLog == "1":
    plt.xscale('log')
  
  # Sets y-axis scale to logarithmic if "is y-axis logarithmic" config is set to 1
  if isYLog == 1 or isYLog == "1":
    plt.yscale('log')

  # Inverts y-axis if "is y-axis inverted" config is set to 1
  if isYInv == 1 or isYInv == "1":
    scatter.axes.invert_yaxis()

  # Calculates and draws the trendline of the scatter plot if the "notTrend" config is set to 1
  # The given x and y values have to be numeric, of the same type and not empty!
  # This code is disabled: it does not work with the given sample because of missing data and needs to be evaluated
  # One approach to fix this would be to remove every row in which x or y is missing
  """
  if notTrend == 1  or notTrend == "1":
    trendCompatible = True

    for i, j in zip(x, y):
      if type(i) != type(j) or np.isnan(i) or np.isnan(j):
        trendCompatible = False
        break

    if trendCompatible == True:
      z = np.polyfit(x, y, 1)
      p = np.poly1d(z)
      plt.plot(x, p(x), "r--")
  """

  # Replaces "/" in the label names for file saving compatibility and saves the plot as a png file
  xlabel = plotData[0].replace('/', '_')
  ylabel = plotData[1].replace('/', '_')
  plt.savefig("plots/" + xlabel + " X " + ylabel + ".png", format='png')

  # Closes current figure
  plt.close()
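
The disabled trendline code suggests removing rows with missing values before fitting. The following is a minimal sketch of that idea, not a tested replacement for the code above. The helper name `linear_trend` and the example values are made up for illustration, and non-numeric entries are treated as missing, which is an assumption about how the sample should be handled.

```python
import pandas as pd
import numpy as np

def linear_trend(x, y):
    """Return (slope, intercept) of a least-squares line, or None if no line can be fitted."""
    # Non-numeric entries become NaN instead of raising an error
    xs = pd.to_numeric(pd.Series(list(x)), errors="coerce")
    ys = pd.to_numeric(pd.Series(list(y)), errors="coerce")

    # Keeps only the rows where both values are present
    mask = xs.notna() & ys.notna()
    xs = xs[mask].to_numpy(dtype=float)
    ys = ys[mask].to_numpy(dtype=float)

    # A line needs at least two points with different x values
    if len(xs) < 2 or xs.max() == xs.min():
        return None

    slope, intercept = np.polyfit(xs, ys, 1)
    return slope, intercept

# Made-up values with one missing x and one missing y
result = linear_trend([1, 2, None, 4, 5], [2.0, 4.0, 6.0, float("nan"), 10.0])
print(result)
```

Inside the plotting loop, the returned slope and intercept could be used to draw the line over the cleaned x values with `plt.plot`. On a logarithmic axis the fitted line will not appear straight, so the fit may need to be evaluated separately for those plots.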