### 📝 Learning goals of this practical

- You can differentiate between marker-assisted breeding and genomic selection

- You can describe the interaction between computational methods and plant crosses

- You can interpret Manhattan plots and breeding schemes, and list their goals


In [4]:
import sys
if "google.colab" in sys.modules:
    %pip install git+https://github.com/CropXR/EduXR.git
else:
    %load_ext autoreload
    %autoreload 2



In [2]:
from dsplantbreeding.Population import get_agricultural_population, get_natural_population, get_resilient_population
from dsplantbreeding.actions import perform_cross_between



## General introduction
We want to breed plants that are salt-resistant but also have a high yield. To study what makes plants salt-resistant, we first investigate a large, genetically diverse population.

In [3]:
my_population = get_natural_population(n_plants=10)
my_population.show_size()
my_population.head()

Population has 10 plants and 50 markers.
   Marker_0  Marker_1  Marker_2  Marker_3  Marker_4  Marker_5  Marker_6  \
0         1         1         1         0         1         1         1   
1         0         0         1         0         1         0         0   
2         0         1         1         1         1         0         1   
3         0         1         1         0         0         0         1   
4         1         1         1         0         0         0         1   

   Marker_7  Marker_8  Marker_9  ...  Marker_42  Marker_43  Marker_44  \
0         0         0         0  ...          0          1          1   
1         1         1         0  ...          0          0          0   
2         1         0         1  ...          1          1          1   
3         0         1         0  ...          1          1          0   
4         0         1         0  ...          1          0          1   

   Marker_45  Marker_46  Marker_47  Marker_48  Marker_49  \
0        
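To get a feel for this table, the sketch below copies the first seven marker columns of the five plants printed above into a plain Python list and computes, per marker, the fraction of plants with value `1`. The variable and function names are illustrative and not part of `dsplantbreeding`; the sketch assumes each row is one plant and each column is one marker.

```python
# First seven marker columns of the five plants printed above (copied by hand).
genotypes = [
    [1, 1, 1, 0, 1, 1, 1],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 1, 1, 1, 1, 0, 1],
    [0, 1, 1, 0, 0, 0, 1],
    [1, 1, 1, 0, 0, 0, 1],
]


def allele_frequencies(rows):
    """Fraction of plants with value 1, computed per marker column."""
    n_plants = len(rows)
    return [sum(column) / n_plants for column in zip(*rows)]


frequencies = allele_frequencies(genotypes)
for index, frequency in enumerate(frequencies):
    # A frequency of 0 or 1 means every plant in this subset has the same value.
    note = "  (no variation in these plants)" if frequency in (0.0, 1.0) else ""
    print(f"Marker_{index}: {frequency:.1f}{note}")
```

A marker at which all plants have the same value, like `Marker_2` in this subset, cannot be tested for association within those plants, because there are no two groups to compare.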

### ❓Question

- What do the rows and columns in the data represent?

Now we perform a genome-wide association study (GWAS) to find markers that are associated with our phenotype.

In [None]:
my_population.show_manhattan_plot(to_phenotype='Salt Resistance (% survival)')
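A Manhattan plot shows one association score per marker, usually `-log10(p)`. How `show_manhattan_plot` computes its p-values is not shown in this notebook, so the sketch below swaps in a simple permutation test on simulated data to illustrate the idea. All names, the simulated effect size, and the choice of test are assumptions made for illustration.

```python
import math
import random


def simulate_population(n_plants, n_markers, causal_marker, effect, rng):
    """Toy data: random 0/1 markers, one of which shifts the phenotype."""
    genotypes = [[rng.randint(0, 1) for _ in range(n_markers)]
                 for _ in range(n_plants)]
    phenotypes = [50 + effect * plant[causal_marker] + rng.gauss(0, 5)
                  for plant in genotypes]
    return genotypes, phenotypes


def mean_difference(alleles, values):
    """Absolute difference in mean phenotype between allele 1 and allele 0."""
    carriers = [v for a, v in zip(alleles, values) if a == 1]
    others = [v for a, v in zip(alleles, values) if a == 0]
    if not carriers or not others:
        return 0.0  # no variation at this marker, nothing to compare
    return abs(sum(carriers) / len(carriers) - sum(others) / len(others))


def permutation_p_value(alleles, phenotypes, rng, n_permutations=500):
    """Share of phenotype shuffles whose difference is at least the observed one."""
    observed = mean_difference(alleles, phenotypes)
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if mean_difference(alleles, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


rng = random.Random(0)
causal_marker = 3  # hypothetical marker with a real effect in the simulation
genotypes, phenotypes = simulate_population(100, 8, causal_marker, 20, rng)

scores = []
for marker in range(8):
    alleles = [plant[marker] for plant in genotypes]
    p_value = permutation_p_value(alleles, phenotypes, rng)
    scores.append(-math.log10(p_value))  # height of this marker's point

for marker, score in enumerate(scores):
    print(f"Marker_{marker}: -log10(p) = {score:.2f}")
```

With 500 shuffles the smallest p-value this test can report is 1/501, so the scores are capped; an analytical test does not have that limit.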

### ❓Questions

* What does this plot show?

* Why does the population have to be genetically diverse?

* What do you think happens to the plot as the size of the population increases?

* Which marker do you think is most promising? Plot its association with the phenotype by filling in the marker index below.


In [None]:
my_population.show_marker_to_phenotype_relation(marker_location=FILL_IN_MARKER_INDEX, to_phenotype='Salt Resistance (% survival)')

### ❓Questions

* Does the p-value for this marker make sense intuitively?

* What do you think happens to the Manhattan plot as the size of the population increases?


Let's try increasing the population size. Does the new Manhattan plot match your expectation?

In [None]:
my_population = get_natural_population(n_plants=100)
my_population.show_manhattan_plot(to_phenotype='Salt Resistance (% survival)')

Now let's introgress this marker into a high-yielding agricultural plant that farmers already use, but which does not have the desired marker.

To do this, we cross this plant with a resilient plant containing the desired marker.

In [None]:
resilient_population = get_resilient_population()
resilient_population.show_size()
# Now check if it indeed contains the marker at your location
resilient_population.show_marker_at_location(FILL_IN_INDEX)

In [None]:
agricultural_population = get_agricultural_population()
agricultural_population.show_marker_at_location(FILL_IN_INDEX)

Let's check how the two plants differ in phenotype:

In [None]:
# check difference in phenotypes
resilient_population.show_all_phenotypes()
agricultural_population.show_all_phenotypes()

Now start the marker-assisted backcrossing.

In [None]:
new_population = perform_cross_between(resilient_population, agricultural_population, n_offspring=10)
selected_population = new_population.select_plants_with_marker_at_location(FILL_IN_INDEX, desired_allele=1)
selected_population.n_plants

### 📝 Fill in
 - Which population should the selected offspring be crossed with? Fill that in below!

In [None]:
backcross_1 = perform_cross_between(selected_population, SELECT_POPULATION_TO_CROSS, n_offspring=10)
selected_back1_population = backcross_1.select_plants_with_marker_at_location(FILL_IN_INDEX, desired_allele=1)

backcross_2 = perform_cross_between(selected_back1_population, SELECT_POPULATION_TO_CROSS, n_offspring=10)
selected_back2_population = backcross_2.select_plants_with_marker_at_location(FILL_IN_INDEX, desired_allele=1)

selected_back2_population.show_all_phenotypes()

### ❓Questions


* What is the advantage of selecting each population based on markers, rather than phenotype?

* Why do we cross the offspring with the population you filled in above?

* How can you see if this breeding was successful? Was it successful in this case?

* What do you expect to happen when we perform breeding without selecting for the markers?

In [None]:
new_population = perform_cross_between(resilient_population, agricultural_population, n_offspring=10)
backcross_1 = perform_cross_between(new_population, agricultural_population, n_offspring=10)
backcross_2 = perform_cross_between(backcross_1, agricultural_population, n_offspring=10)
backcross_2.show_all_phenotypes()
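The two breeding schemes can also be compared on a toy genome. The sketch below does not use `dsplantbreeding`. It assumes a donor parent homozygous for allele 1 at every locus, a recurrent parent homozygous for allele 0, unlinked loci, and offspring that inherit a donor allele from a heterozygous locus with probability 1/2. All names are hypothetical, and 200 offspring per cross are used instead of 10 to keep the numbers stable.

```python
import random

N_LOCI = 200
TARGET = 0  # hypothetical index of the marker we want to keep


def backcross(parents, n_offspring, rng):
    """Cross randomly chosen parents with the recurrent parent.

    A plant is a list holding 1 where it is heterozygous for the donor
    allele and 0 where it matches the recurrent parent. Each donor allele
    is passed on with probability 1/2.
    """
    offspring = []
    for _ in range(n_offspring):
        parent = rng.choice(parents)
        offspring.append([locus if rng.random() < 0.5 else 0
                          for locus in parent])
    return offspring


def carrier_fraction(plants):
    """Fraction of plants that still carry the donor allele at TARGET."""
    return sum(plant[TARGET] for plant in plants) / len(plants)


def donor_fraction(plants):
    """Average share of non-target loci that still carry a donor allele."""
    total = sum(sum(plant) - plant[TARGET] for plant in plants)
    return total / (len(plants) * (N_LOCI - 1))


def run_scheme(select, n_backcrosses, rng, n_offspring=200):
    """Backcross an F1 repeatedly, optionally keeping only marker carriers."""
    plants = [[1] * N_LOCI]  # F1: heterozygous at every locus
    for _ in range(n_backcrosses):
        plants = backcross(plants, n_offspring, rng)
        if select:
            plants = [plant for plant in plants if plant[TARGET] == 1]
    return plants


rng = random.Random(1)
with_selection = run_scheme(select=True, n_backcrosses=2, rng=rng)
without_selection = run_scheme(select=False, n_backcrosses=2, rng=rng)

print("carriers with selection:   ", carrier_fraction(with_selection))
print("carriers without selection:", carrier_fraction(without_selection))
print("donor loci elsewhere:      ", round(donor_fraction(with_selection), 2))
```

Under these assumptions the share of marker carriers is expected to halve with every unselected backcross, while the share of donor loci elsewhere in the genome is expected to halve in both schemes. Real genomes have linkage, so loci close to the target marker tend to be inherited together with it.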

How would you apply marker-assisted selection (MAS) to breed for yield?


In [None]:
my_population.show_manhattan_plot(to_phenotype='Yield (Kg/Ha)')

### ❓Questions
* How is this different from the Manhattan plot you saw earlier? 

* What would this mean for MAS?
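The learning goals contrast marker-assisted selection with genomic selection. As a hedged illustration of that contrast, the sketch below simulates a hypothetical trait controlled by 30 markers with equally small effects (an assumption for this example, not a description of the yield data above). It compares a prediction from the single strongest marker with a prediction that sums the estimated effects of all markers. Summing per-marker mean differences is a deliberately simple stand-in for the penalized regression models normally used in genomic selection, and all names are illustrative.

```python
import random


def simulate(n_plants, effects, rng, noise_sd=2.0):
    """Toy plants with 0/1 markers; phenotype = sum of marker effects + noise."""
    genotypes = [[rng.randint(0, 1) for _ in effects] for _ in range(n_plants)]
    phenotypes = [sum(e * a for e, a in zip(effects, plant)) + rng.gauss(0, noise_sd)
                  for plant in genotypes]
    return genotypes, phenotypes


def estimate_effects(genotypes, phenotypes):
    """Per marker: mean phenotype with allele 1 minus mean with allele 0."""
    estimates = []
    for column in zip(*genotypes):
        carriers = [p for a, p in zip(column, phenotypes) if a == 1]
        others = [p for a, p in zip(column, phenotypes) if a == 0]
        if carriers and others:
            estimates.append(sum(carriers) / len(carriers) - sum(others) / len(others))
        else:
            estimates.append(0.0)  # no variation, no estimate possible
    return estimates


def correlation(xs, ys):
    """Pearson correlation between two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(2)
effects = [1.0] * 30  # hypothetical: many markers, each with a small effect
train_genotypes, train_phenotypes = simulate(300, effects, rng)
test_genotypes, test_phenotypes = simulate(300, effects, rng)

estimates = estimate_effects(train_genotypes, train_phenotypes)

# Single-marker approach: use only the marker with the largest estimated effect.
best = max(range(len(estimates)), key=lambda i: abs(estimates[i]))
single_prediction = [estimates[best] * plant[best] for plant in test_genotypes]

# All-marker approach: sum the estimated effects of every marker a plant carries.
all_prediction = [sum(e * a for e, a in zip(estimates, plant))
                  for plant in test_genotypes]

single_marker_accuracy = correlation(single_prediction, test_phenotypes)
all_markers_accuracy = correlation(all_prediction, test_phenotypes)
print(f"single best marker: r = {single_marker_accuracy:.2f}")
print(f"all markers:        r = {all_markers_accuracy:.2f}")
```

In this simulated setting no single marker explains much of the trait, so a prediction that combines all markers follows the phenotype of unseen plants more closely.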



## Back to lecture

If you have extra time, feel free to attempt using MAS to breed for maximum yield!