<img style="float: right;" width = 450 src="ternary_uk.png"> 

# Ternary Soil Plots

### Generating Soil Texture Ternary Plots with Python



A soil texture ternary plot is a graphical representation of the relative proportions of sand, silt, and clay in a soil sample. These three components are considered the main indicators of soil texture and are used to classify soils into different texture classes. 

The plot is arranged in the shape of a triangle, with the three corners representing sand, silt, and clay, respectively. Each point within the triangle represents a unique combination of the three components, and the relative position of a point within the triangle indicates the relative proportions of the components in the soil sample. This type of plot is often used in soil science to quickly and easily visualize the texture of a soil sample.
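
To make the idea of "position within the triangle" concrete, here is a minimal sketch of how three percentages can be turned into a single point. The function name `ternary_to_xy` and the corner layout (sand bottom-left, silt bottom-right, clay at the top) are assumptions made for illustration; the plotting module used later in this notebook may arrange its axes differently.

```python
import math

def ternary_to_xy(sand, silt, clay):
    """Map sand/silt/clay amounts to x, y coordinates in a triangle with sides of length 1.

    Assumed layout (for illustration only): sand at (0, 0), silt at (1, 0),
    clay at (0.5, sqrt(3)/2).
    """
    total = sand + silt + clay
    if total <= 0:
        raise ValueError("sand + silt + clay must be positive")
    # Normalise so that the three fractions sum to 1
    silt_frac = silt / total
    clay_frac = clay / total
    # The point is the weighted average of the three corner positions;
    # the sand corner sits at the origin, so it contributes nothing to x or y
    x = silt_frac + 0.5 * clay_frac
    y = (math.sqrt(3) / 2) * clay_frac
    return x, y

print(ternary_to_xy(100, 0, 0))   # pure sand sits on the sand corner
print(ternary_to_xy(40, 40, 20))  # a mixed sample sits inside the triangle
```

Because the three values always sum to 100, two coordinates are enough to show all three components.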

The aim of this data analysis session is to plot a soil texture ternary plot of the data you have collected in the workshop to provide a visual comparison of your samples.


You should have already entered your group's data into the spreadsheet at: <https://docs.google.com/spreadsheets/d/1A1RJ-sAi5mYouKR--rf63nZjF5GpuFRb/edit>

**Please make sure that:**

* All of your percentage values are integers (whole numbers)
* You have not added a percentage sign after the number
* Your values for sand + silt + clay add up to 100 (n.b. this does not include the value for organic matter)
* You have not left any other comments or formatting on the spreadsheet
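
The checks in the list above could also be carried out in code. The sketch below is one possible approach, not part of the workshop code: the helper `find_invalid_rows` and the example rows are hypothetical, and the column names `sand`, `silt` and `clay` are assumed to match the class spreadsheet.

```python
import pandas as pd

# Hypothetical example rows; column names are assumed to match the class sheet
example = pd.DataFrame({
    "group": ["A", "A", "B"],
    "sand": [40, 60, 33],
    "silt": [40, 30, 33],
    "clay": [20, 10, 33],
})

def find_invalid_rows(df, parts=("sand", "silt", "clay")):
    """Return rows whose texture percentages are not whole numbers summing to 100."""
    # Entries that are not plain numbers (e.g. "40%") become NaN
    values = df[list(parts)].apply(pd.to_numeric, errors="coerce")
    # NaN fails this comparison, so non-numeric entries are flagged too
    whole = (values % 1 == 0).all(axis=1)
    sums_to_100 = values.sum(axis=1) == 100
    return df[~(whole & sums_to_100)]

print(find_invalid_rows(example))
```

In this example only the last row is reported, because 33 + 33 + 33 is 99 rather than 100.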


### 1.1 Download the class soil structure data

Run the code cell below to download the soil data spreadsheet to your Noteable folder. This code uses a Python module called `requests` to download the spreadsheet file as a csv and to save it in the same location as this Jupyter notebook. 

In [None]:
# The code in this cell downloads the Google Sheet containing the class soil data as a csv file
# Rerunning this cell will download the latest version of the sheet

import requests
import pandas as pd

# Downloading URL of the Google Sheet
url = "https://docs.google.com/spreadsheets/d/1A1RJ-sAi5mYouKR--rf63nZjF5GpuFRb/export?format=csv"

# Output file name
output_file = "soil_data.csv"

# Download the Google Sheet
response = requests.get(url)

# Check if the request was successful
if response.status_code == 200:
    # Write the content to a CSV file
    with open(output_file, 'wb') as file:
        file.write(response.content)
    print(f"Google Sheet downloaded and saved as {output_file}.")
else:
    print(f"Failed to download the Google Sheet. HTTP Status Code: {response.status_code}")


### 1.2 Check the downloaded data

You previously used the pandas function `read_csv()` to read data into a dataframe during the workshops in Variation 1. 

Write Python code in the cell below to load the data in `soil_data.csv` into a dataframe called `soil_df`, then print the contents of the dataframe. 

*Hint: you could also try using the function `display(soil_df)`, which gives cleaner output than `print(soil_df)`.* 

In [None]:
#Write your code here

### 1.3 Identify the types of variable your data contains

In Variation 1, you learned that there are two main types of variable, **categorical variables** and **numerical variables**.

Categorical variables can be either **nominal** or **ordinal**, and numerical variables can be either **continuous** or **discrete**. 

If you are unsure of these definitions, check Self Study Notebook 2.4 from Variation 1.

Identify the types of variables contained in the following columns and write them in the text box below.


### 1.4 Plot your own group's data as a ternary plot

The graphing libraries that we usually use (e.g. seaborn or matplotlib) do not have a dedicated function for drawing a soil texture ternary plot. Here we will instead import a custom module that has been designed for the purpose (the source is available on GitHub at <https://github.com/mishagrol/SoilTriangle>).

Edit the code block below to insert your group name in the line marked `EDIT THIS LINE`, which reads `df = df.query('group == "Peaty Blinders "')`. The name must match the text in the spreadsheet exactly, including any spaces. Run the code block to produce the graph. Then try commenting out the line you edited by adding a `#` symbol at the start of the line, and rerun the code. What does this do?
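
As a small illustration of how `.query()` compares text, the sketch below uses a made-up dataframe (the group names and values are hypothetical). It shows why a group name with a stray trailing space only matches when the space is included, and one way to make the comparison less fragile.

```python
import pandas as pd

# Hypothetical data: note the trailing space in the first two group names
df = pd.DataFrame({
    "group": ["Peaty Blinders ", "Peaty Blinders ", "Soil Mates"],
    "box_label": [1, 2, 1],
})

# Without the trailing space nothing matches
exact = df.query('group == "Peaty Blinders"')
# With the trailing space the text matches what is stored
with_space = df.query('group == "Peaty Blinders "')
print(len(exact), len(with_space))

# Stripping leading and trailing whitespace first avoids the problem
df["group"] = df["group"].str.strip()
stripped = df.query('group == "Peaty Blinders"')
print(len(stripped))
```

If your plot comes out empty, checking the exact spelling and spacing of the group name in the spreadsheet is a good first step.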


In [None]:
#This code imports the libraries that allow Python to draw a Triangle Plot

from trianglegraphgc import SoilTrianglePlot
import matplotlib.cm as cm
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
#set the size of the plot
plt.rcParams['figure.figsize'] = [12, 8]
plt.rcParams.update({'font.size': 22})

#This graphing module only takes data in the form of a csv
#This code block reads soil_data.csv, selects your group's rows and saves them as a new csv
#It uses a method called .query() to select only data from your group

df=pd.read_csv('soil_data.csv')

df['site']=df['group']+' '+df['box_label'].map(str)
df = df.query('group == "Peaty Blinders "') ## <---- EDIT THIS LINE
display(df)
df.to_csv('soil_data_own_group.csv', index=False)

#generate the plot
tp = SoilTrianglePlot('Ternary Plot of Soil Structure for Modelled Soils')
tp.soil_categories(country='Britain')
tp.scatter_from_csv('soil_data_own_group.csv', hue='om', tags='site')
#tp.colorbar('Organic matter (%)')
tp.show('triangleplot_british')

#Save the plot as a JPEG file (you could also save as a png or PDF)
plt.savefig('my_ternary_plot.jpeg')

### Plot a bar chart of your own group's data only

In Variation 1, you used Seaborn to plot histograms. The code in the cell below uses Pandas to draw a stacked bar graph of the different soil conditions your group has modelled. It reads in data from the csv that you created in the previous code cell (called `soil_data_own_group.csv`).

1. Run the code to see the result. 
2. The plot does not have a y axis label: edit the code to add an appropriate y axis label then rerun the code. 

We will explore different types of plot in the lecture in week 2. 

In [None]:
# Bar charts

# importing package
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
 
# read in the data for your own group
df = pd.read_csv("soil_data_own_group.csv")
# view data
display(df)
 
# plot the data as a stacked bar chart, with one bar per soil type
df[['sand', 'silt', 'clay', 'om', 'soiltype']].plot(x='soiltype', kind='bar', stacked=True, title='Composition of modelled soil types')
plt.xlabel("Soil Type")
plt.show()

### How does your data compare to other groups?

In this comparison, we want to see how the data from different groups compares. To be able to compare different groups' data, we need to plot comparable data on the same plot. In this case, it makes sense to draw a different plot for each box label in the dataset. 

We can't know in advance what soil types groups will have entered, so we read this directly from the data. In the code below, `df.box_label.unique()` returns an array of all the unique box_label ids in the class dataset. The code then loops through each of those box labels. Since the graph plotting code is in the repeated block of that loop, the code draws a different graph for each box label. 
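
The sketch below shows the same pattern on a tiny made-up dataframe, without any plotting, so that you can see what `unique()` returns and what each pass of the loop selects. The data values and the `rows_per_box` dictionary are hypothetical and only used for illustration.

```python
import pandas as pd

# Hypothetical class data; column names are assumed to match the class sheet
df = pd.DataFrame({
    "group": ["A", "B", "A", "B"],
    "box_label": [1, 1, 2, 2],
    "sand": [40, 45, 70, 65],
})

# unique() returns the distinct labels in order of first appearance
labels = df["box_label"].unique()

rows_per_box = {}
for box in labels:
    # @box tells query() to use the value of the Python variable called box
    subset = df.query("box_label == @box")
    rows_per_box[int(box)] = len(subset)
    print(box, list(subset["group"]))
```

Each pass through the loop works on a different subset of rows, which is why the plotting code inside the loop produces one chart per box label.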

We will explore the use of loops to plot graphs further in the lecture in week 2. 

In [None]:
# Use the single class spreadsheet containing all groups' data
# Loop through the different box labels and plot data from all groups that have studied that soil type
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
 
# read data into a dataframe of all groups
df = pd.read_csv("soil_data.csv")
 
for box in df.box_label.unique():
    print(box) #Print label for box
    df.query('box_label == @box')[['sand', 'silt', 'clay', 'om', 'soiltype', 'group']].plot(x='group', kind='bar', stacked=True)
    plt.xlabel("Group")
    plt.ylabel("Percentage (%)")
    plt.title(f"Soil composition for boxes labelled {box}")
    plt.legend(bbox_to_anchor=(1.0, 1.0))
    plt.show()

**Group discussion questions**

Once you have plotted all groups' data, discuss the following questions as a group:
 * Did your data agree with the other groups' data? What variation did you see, and what do you think caused that variation?
 * What did each chart allow you to tell about your data? What other types of chart could you use?
 
 

