# Day 12 In-class Assignment

---


### <p style="text-align: right;"> &#9989; Put your name here.</p>

#### <p style="text-align: right;"> &#9989; Put your group member names here.</p>

## Is it safe to consume hemp-fed cows? How much beef do you need to eat to get "high"?

<img src="https://media.licdn.com/dms/image/v2/D5610AQEpshnp9M3QSA/image-shrink_1280/image-shrink_1280/0/1687785894642?e=2147483647&v=beta&t=9QZgnbeDINQToztUzQE4i3_U3mdMQdMitcSZH5iHkj0" style="display:block; margin-left: auto; margin-right: auto; width: 55%" alt="Two images. One depicts cannabis leaves. The other one shows a cow staring at the camera.">
<p style="font-size:0.85em; text-align: center;">Credits: <a href="https://www.abc.net.au/news/2023-06-25/farmers-feed-livestock-hemp-based-pellets/102426246" target="_blank">ABC Australia</a></p>

### Learning goals for today's assignment

* Use Pandas to filter data to select particular subsets of interest
* Articulate, based on your own perception, what you think makes a data visualization "good" versus "bad"
* Use data to support a claim or make an argument

### Assignment instructions

Work with your group to complete this assignment. Instructions for submitting this assignment are at the end of the Notebook. The assignment is due at the end of class.


---
# Background and motivation

We will keep exploring the dataset on cannabinoid concentrations in hemp-fed cattle. In the last in-class assignment, we showed that some &Delta;9-THC (the main psychoactive component of cannabis) does accumulate in the adipose tissue of these cattle. This THC *could* then enter the body of whoever eats the beef, so the final consumer *could* suffer unintended THC exposure.

Is the THC in the **cattle fat (adipose tissue)** enough to have adverse effects on the final consumer?

### Ground rules to determine what is safe

According to the [European Food Safety Authority (EFSA)](https://doi.org/10.2903/j.efsa.2015.4141), the allowable daily exposure rates are:
+ 1 &micro;g/kg for &Delta;9-THC
+ 150 &micro;g/kg for CBD

Notice that the rate depends on the subject's body mass (in kg). People will be exposed to THC by consuming fat.

+ From the [CDC](https://pubmed.ncbi.nlm.nih.gov/33541517/), we know the average body mass in the US.
+ From the [EPA](https://www.epa.gov/expobox/exposure-factors-handbook-chapter-11), we know the 90th percentile of animal fat consumption.

Combining and separating the results based on sex and age, we have:

| Age y   |   BW kg (M) |   Fat intake g/d (M) |   BW kg (F) |   Fat intake g/d (F) |
|:--------|------------:|---------------------:|------------:|---------------------:|
| **Newborn** |        3.41 |                   79 |        3.29 |                   69 |
| **0.5**     |        8.41 |                   79 |        7.66 |                   69 |
| **1**       |       10.59 |                  125 |        9.79 |                   89 |
| **1.5**     |       12.07 |                  125 |       11.11 |                   89 |
| **2–3**     |       13.14 |                  121 |       12.48 |                  109 |

You can read more in the original paper:

> Fritz, B.R., Kleinhenz, M.D., Magnin, G. *et al.* (2025) [Tissue residue depletion of cannabinoids in cattle administered industrial hemp inflorescence.](https://doi.org/10.1038/s41598-025-26448-5) *Scientific Reports* **15**(42337)

## Getting started

We start by importing the usual libraries and loading the datasets with Pandas:
- `concentration_raw.csv` is the actual raw data from Fritz et al. (2025).
- `safety_limits.csv` is the full table of age, body mass, and fat intake.

In [1]:
import matplotlib.pyplot as plt

import numpy as np
import pandas as pd

# Loading the data
data = pd.read_csv('concentration_raw.csv')
safety = pd.read_csv('safety_limits.csv', index_col=0)

&#9989;&nbsp; **Task 1**

Double-check that your data loaded correctly by displaying the first few lines.

In [2]:
# Put your code here


#### Description of data fields in this data set:

- *Tissue*: the tissue in which the cannabinoid concentration was measured
- *ID*: unique identifier for the specific head of cattle
- *Time (d)*: days since the last cannabis dosage
- Remaining columns: concentration of the indicated cannabinoid in ng/g

---

## 1. Analysis Using Descriptive Statistics

&#9989;&nbsp; **Task 2**

Use the `.describe()` method to determine the mean, standard deviation, min, median, and max of the &Delta;9-THC concentration.
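As a reminder of what `.describe()` returns, here is a minimal sketch on a toy Series. The values and the name `toy_conc` are made up for illustration and are not taken from the dataset. Note that the median is reported under the `50%` label.

```python
import pandas as pd

# Toy concentrations in ng/g (made-up values, NOT from the assignment data)
toy = pd.Series([2.0, 4.0, 6.0, 8.0], name="toy_conc")

# .describe() returns count, mean, std, min, quartiles, and max
stats = toy.describe()
print(stats)

# The median is the 50th percentile, stored under the '50%' label
toy_median = stats["50%"]
```

The same method works on a single column of a dataframe, since a column is also a Series.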

In [3]:
# Write your code here


**&#9989;&nbsp; Question 3**

From this information alone, can you determine if the 9-THC content *in the cattle fat (adipose tissue)* is below the allowable threshold?

<font size="+3" color="green">&#9998;</font> *Write your answer here*

**&#9989;&nbsp; Question 4**

- Does it make sense to look at the average 9-THC content *across all data points (rows)* in the first place?
- Would it be better to look at just a subset of rows?

<font size="+3" color="green">&#9998;</font> *Write your answer here*

---

## 2. Masking and refining the dataset

The average of the whole dataset makes little sense: we are mixing tissues and times. The averages would make much more sense if they were for each *tissue* and each *timepoint*.

&#9989;&nbsp; **Task 5**

- Mask the dataset above so that we are only looking at *adipose tissue*
- We do not want to overwrite the original dataset (with all the data), so store the masked dataframe in a different variable. 
- Visualize the first few rows of this masked dataframe
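If you need a refresher on masking, the sketch below applies a boolean mask to a toy dataframe. The tissue labels and the `toy_conc` column are assumptions made for illustration; check the actual labels in your own data before adapting it.

```python
import pandas as pd

# Toy dataframe (made-up values; the tissue labels are assumed)
toy = pd.DataFrame({
    "Tissue": ["Liver", "Adipose", "Adipose", "Kidney"],
    "Time (d)": [1, 1, 2, 1],
    "toy_conc": [0.5, 3.0, 2.0, 0.1],
})

# A mask is a boolean Series: True where the condition holds
is_adipose = toy["Tissue"] == "Adipose"

# Indexing with the mask keeps only the True rows;
# storing the result under a new name leaves `toy` untouched
toy_adipose = toy[is_adipose]
toy_adipose.head()
```

Because the result is stored in a new variable, the full dataframe remains available for later comparisons.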

In [4]:
# Write your code here


&#9989;&nbsp; **Task 6**

Now notice that several columns contain exclusively NaNs (not a number). As in the previous class, it makes little sense to analyze cannabinoids that are nonexistent.

- Use `.dropna` to drop the columns that have *all* NaNs
   - That is: drop columns like `THCA` or `CBDVA`
   - Do not drop `CBLA`: there are a couple of cattle that did report some concentration.
- Make sure to save and visualize the dataframe
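The difference between dropping columns that are *all* NaN and columns that contain *any* NaN can be seen in this small sketch. The column names `A`, `B`, and `C` are hypothetical placeholders, not columns from the dataset.

```python
import numpy as np
import pandas as pd

# Toy dataframe with hypothetical columns
toy = pd.DataFrame({
    "A": [1.0, 2.0, 3.0],
    "B": [np.nan, np.nan, np.nan],  # every value is NaN
    "C": [np.nan, 0.4, np.nan],     # only some values are NaN
})

# axis=1 targets columns; how='all' drops a column only if ALL its values are NaN
cleaned = toy.dropna(axis=1, how="all")
print(cleaned.columns.tolist())
```

Here `B` is dropped while `C` survives, which mirrors the behaviour asked for in this task.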

In [5]:
# Write your code here


In [6]:
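# Assumes `tconc` is the adipose-only dataframe you stored in Task 5.
# Running this cell before defining `tconc` raises a NameError.
# Drop only the columns in which every value is NaN
tconc = tconc.dropna(axis=1, how='all')
tconc.head()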

&#9989;&nbsp; **Task 7**

Not every NaN was dropped. The NaNs in the "CBLA" column indicate that no cannabinoid was detected in *some* of the samples. Here it makes sense to have a zero instead of a NaN, because zero is indeed a number. Fortunately, Pandas has an easy way to do just that.

- Use `.fillna` to replace the NaNs with zeroes. Use either the `?` command or an online search of the documentation to check its exact usage
- As always, remember to save and visualize your edited dataframe

**Note**: On whether NaNs should be replaced by zeroes or dropped entirely, it is **a matter of context**. In this case, we replace them with zeroes because we *know* that they indicate *undetected* as opposed to *data was lost/not run*.
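To see why this choice matters, the sketch below compares the mean of a toy column before and after filling. The column names and values are hypothetical. Pandas skips NaNs when averaging, so replacing them with zeroes lowers the mean.

```python
import numpy as np
import pandas as pd

# Toy dataframe with hypothetical columns and made-up values
toy = pd.DataFrame({
    "A": [1.0, 2.0, 3.0],
    "C": [np.nan, 0.4, np.nan],
})

# Replace every NaN with 0 and store the result under a new name
filled = toy.fillna(0)

# NaNs are skipped by .mean(), zeroes are not
mean_with_nans = toy["C"].mean()       # averages the single detected value
mean_with_zeroes = filled["C"].mean()  # averages three values, two of them 0
print(mean_with_nans, mean_with_zeroes)
```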

In [7]:
# Write your code here


&#9989;&nbsp; **Task 8**

That is a clean dataset. We still have to separate entries by days after last dosage.

- Using the variable `day` below, mask your dataset so that it only contains the entries for that particular day
- Store the masked dataframe in a new variable and visualize it.


In [8]:
# Mask and make a NEW dataframe
day = 1

In [9]:
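# Assumes `tconc` is the cleaned adipose-only dataframe from Tasks 5 to 7;
# running this cell before defining `tconc` raises a NameError.
day = 1
df = tconc[tconc['Time (d)'] == day]  # keep only the rows for the chosen day
df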

---

## 3. Summarizing our results: focus on &nbsp;&Delta;9-THC

Now we finally have only the entries corresponding to adipose tissue one day after the last THC dosage. Going back to the original motivation, from here on let's focus on &Delta;9-THC concentrations.

Our final goal is to fill in this dataframe:

| Day | Mean | SD | Min | Median | Max | 
| :----- | :----- | :------ | :----- | :----- | :----- |
| **1** | ??| ?? | ??  | ?? | ?? |
| **2** | ??| ?? | ??  | ?? | ?? |
| **3** | ??| ?? | ??  | ?? | ?? |
| **5** | ??| ?? | ??  | ?? | ?? |
| **8** | ??| ?? | ??  | ?? | ?? |


&#9989;&nbsp; **Task 9**

Below you have an empty summary dataframe. Its rows correspond to days after the last THC dosage, and its columns to the descriptive statistics of &Delta;9-THC concentration in the adipose tissue.

Similar to the pre-class, fill in the dataframe with the correct numbers. 

*Hint*: You might want to loop with a `day` variable like in Task 8. Make sure you are using `.loc`.

*Hint*: Remember that you can use functions `.mean()`, `.std()`, `.min()`, `.median()`, and `.max()`.
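The pattern suggested by the hints can be sketched on toy data as follows. Everything here (`toy`, `toy_conc`, `toy_days`, `toy_summary`) is a made-up stand-in rather than the assignment data, and only two of the statistics are filled in.

```python
import pandas as pd

# Toy measurements (made-up values, NOT from the dataset)
toy = pd.DataFrame({
    "Time (d)": [1, 1, 2, 2],
    "toy_conc": [4.0, 6.0, 1.0, 3.0],
})

toy_days = [1, 2]
toy_summary = pd.DataFrame(0., index=toy_days, columns=["Mean", "Max"])

for d in toy_days:
    # .loc[row_mask, column] selects one column for the rows of day d
    values = toy.loc[toy["Time (d)"] == d, "toy_conc"]
    # .loc[row_label, column_label] writes a single cell of the summary
    toy_summary.loc[d, "Mean"] = values.mean()
    toy_summary.loc[d, "Max"] = values.max()

print(toy_summary)
```

Using `.loc` for the assignment writes directly into the summary dataframe instead of into a temporary copy.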

In [10]:
# Finish the code here
cannabinoid = '9-THC'
days = [1,2,3,5,8]
summary = pd.DataFrame(0., index=days, columns=['Mean', 'SD', 'Min', 'Median', 'Max'])


---

## 4. Bringing it all together: is it safe to consume?

Now you have a summary of &Delta;9-THC concentrations found in beef fat [in ng/g]. From the `safety` dataframe, we have estimates of how much animal fat a US person consumes [in g/d]. Thus the daily 9-THC intake for a person can be estimated as:

$$\text{Daily 9-THC intake} = \text{Fat intake} \times \text{9-THC concentration}/1000.$$

The division by 1000 is to make sure we get &micro;g/d units for the daily intake (remember that 1000 ng = 1 &micro;g). 
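As a quick check of the units, here is the formula applied once by hand. The fat intake is the male newborn value from the table above. The concentration is a purely hypothetical number chosen to make the arithmetic easy, and says nothing about the real data.

```python
# Fat intake for a male newborn, from the table above [g/d]
fat_intake_g_per_day = 79.0

# HYPOTHETICAL concentration, for illustration only [ng/g]
toy_conc_ng_per_g = 10.0

# (g/d) * (ng/g) = ng/d; dividing by 1000 converts ng/d to ug/d
intake_ug_per_day = fat_intake_g_per_day * toy_conc_ng_per_g / 1000
print(intake_ug_per_day)
```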

&#9989;&nbsp; **Task 10**

- Choose the largest *mean* concentration value out of all the values you summarized in Task 9. Which day reported the highest concentration?
- Choose a sex (`'M'` vs `'F'`)
- Determine how much 9-THC a person would be consuming based on their age. 

In [11]:
# Determine 9-THC intake

&#9989;&nbsp; **Task 11**

Remember that the EFSA allows the intake of 9-THC if it is less than 1 &micro;g *per kg of body weight*. Use the body weight information from `safety` and the intake information from Task 10.

- Are there any age groups at risk of exceeding EFSA limits on 9-THC?
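One way to organize the comparison is sketched below on a small hand-typed copy of the first three male rows of the table above. The column names of the real `safety` dataframe may differ, so treat the names here as assumptions. The concentration is an arbitrary hypothetical value, so the result of this sketch is not an answer to the task.

```python
import pandas as pd

# First three male rows of the table above, typed in by hand;
# the column names are assumed and may not match `safety`
toy_safety = pd.DataFrame(
    {"BW kg (M)": [3.41, 8.41, 10.59], "Fat intake g/d (M)": [79, 79, 125]},
    index=["Newborn", "0.5", "1"],
)

toy_conc = 50.0  # HYPOTHETICAL concentration [ng/g], for illustration only
limit = 1.0      # EFSA value quoted above [ug per kg of body weight]

# Daily intake [ug/d], then intake per kg of body weight [ug/kg]
intake = toy_safety["Fat intake g/d (M)"] * toy_conc / 1000
per_kg = intake / toy_safety["BW kg (M)"]

# Boolean Series: True where the per-kg intake exceeds the limit
exceeds = per_kg > limit
print(exceeds)
```

Operations between columns are applied row by row, so no explicit loop over age groups is needed.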

In [12]:
# Your code here


--- 

## 5. Conclusions

&#9989;&nbsp; **Question 12**

- Based on your analysis, would you feel comfortable consuming hemp-fed cattle? 

<font size="+3" color="green">&#9998;</font> *Write your answer here*

&#9989;&nbsp; **Question 13**

Remember that for Task 10, we set 9-THC concentration as the largest *mean* value.
- Do you think this is a reasonable choice?
- What if we had considered the largest *maximum* value instead, just to be safe? Keep in mind that the maximum depends on a single animal, while the *mean* is a composite of all the animals.

Remember that this is a matter of food safety and public health.

<font size="+3" color="green">&#9998;</font> *Write your answer here*

---

### Assignment wrap-up

Please fill out the form at the link below. You must log in using your MU credentials. **You must completely fill this out in order to receive credit for the assignment!**

#### https://forms.office.com/r/cADesBUd7V

In [13]:
# Click on the link above if this cell fails to produce a survey form.

from IPython.display import HTML
HTML(
"""
<iframe 
	src="https://forms.office.com/r/cADesBUd7V" 
	width="800px" 
	height="600px" 
	frameborder="0" 
	marginheight="0" 
	marginwidth="0">
	Click the link above if this cell fails to produce a survey
</iframe>
"""
)

---

### Congratulations, you're done!

Submit this assignment by uploading it to the course Canvas web page.  Go to the "In-class assignments" folder, find the appropriate submission folder link, and upload it there.

See you in class!

&#169; Copyright 2026,  Division of Plant Science & Technology&mdash;University of Missouri