# Report

## 1. Character: Research Topic

The project explores fatty acid β-oxidation, a spiral pathway that breaks down long-chain fatty acids into acetyl-CoA, NADH, and FADH₂, which in turn fuel ATP production. It focuses on visualizing this process through a web-based app built with Streamlit.

Dietary lipids are first broken down by pancreatic and intestinal lipases into:
- Glycerol
- Fatty acids

These fatty acids are activated to fatty acyl-CoA and transported into the mitochondria.  
β-oxidation happens in the mitochondrial matrix, mainly in liver cells.  
Each cycle shortens the fatty acid by 2 carbon atoms.  
The main products are:
- Acetyl-CoA → enters the Krebs cycle
- NADH and FADH₂ → go to the electron transport chain

The breakdown continues until the entire fatty acid is consumed.  
This stepwise cleavage of carboxylic acids was discovered in 1904 by Georg Franz Knoop.  
The exact reactions were mapped out in 1951 by Feodor Lynen.


## 2. Setting: Niche

Beta-oxidation is commonly taught through lectures, textbooks, and static pathway charts, which often present the process in a fixed and abstract manner. While these resources cover the core concepts, they rarely offer interactive experiences that let students engage with the pathway dynamically.  

What’s missing is a visual, step-by-step tool that helps learners follow each reaction in real time and understand how molecular changes drive energy production.  

This gap defines a clear niche: **educational biochemical visualization tools** designed to make complex metabolic processes more accessible and intuitive.


## 3. Problem

Despite its importance in metabolism, β-oxidation is often perceived as abstract and difficult to grasp. Students frequently struggle with its spiral structure, where each cycle involves subtle molecular changes that are hard to visualize.  

It becomes even more challenging when trying to understand how variations in fatty acid length or saturation influence the number of cycles and ATP yield.  

Without a clear, dynamic representation, the pathway remains a complex concept, making it difficult for learners to connect reaction steps to energy production in a meaningful way.


## 4. Solution

Lynen’s Spiral is a web-based app built with Streamlit to support step-by-step exploration of fatty acid β-oxidation.

The app features three interactive pages:

### Home
Introduces key concepts such as:
- What is a fatty acid?
- Why are fatty acids important?
- The role of NADH and FADH₂, and ATP as the energy currency

Users can select or construct a fatty acid based on chain length and saturation.  
Both 2D and 3D structures are displayed.

### Lynen Spiral
A 3D spiral visualizes all the oxidation cycles.  
Each step (dehydrogenation, hydration, oxidation, and thiolysis) is marked along the spiral.  
The model is interactive and can be rotated.  
For any selected cycle and step, the product structure can be shown.

### Mechanisms
Displays ATP yield calculations, including total ATP, activation cost, FADH₂ yield and NADH yield.  
Users can view individual cycles or all cycles at once.  
Each reaction step includes reactants, products, and a brief description.

By combining structural models, pathway logic, and energy accounting, the app transforms β-oxidation into a visual and interactive learning experience.  
This helps learners connect molecular structure to reaction steps and energy output.


## 5. Materials and Methods

In this project, we combined several tools to construct an interactive visualization of the Lynen spiral and the process of β-oxidation. To make the entire experience accessible in a web browser, we used **Streamlit** as the front-end framework. A specific directory structure allowed us to organize the application into multiple pages (or tabs), each accessible through the sidebar.

On the home page, we relied on Streamlit’s components and visual elements, and we included static images of key molecules (stored in the `data` folder within the `Lynen_spiral_visualisation` directory).

For the fatty acid selection and visualization, we used **RDKit** to draw 2D molecular structures and **Plotly** to make interactive 3D representations. The Lynen spiral itself was generated using **NumPy** to define the geometry of the spiral, while each intermediate product in the β-oxidation cycle was represented using **SVG** images generated by RDKit’s 2D drawing capabilities.

To model the chemical transformations involved in β-oxidation, we used **SMARTS patterns**, which allowed us to define recurring chemical reactions once and apply them to different fatty acid inputs. This functionality was especially important for the *Mechanisms* page, where each β-oxidation step is broken down in detail, and the associated ATP production is calculated and visualized.

Each of these tools and their integration into the application is described in more detail in the following sections, where we outline the main functions of each page.

---

### Fatty Acid / Enhanced Fatty Acid

#### Approach to Modeling Fatty Acid Metabolism

The project implements a computational model of fatty acid β-oxidation, the primary metabolic pathway for fatty acid degradation, using an object-oriented approach with two main classes:

- **`FattyAcidMetabolism` (base class):** Handles basic fatty acid properties and theoretical calculations. It also contains prototype methods to model β-oxidation.

- **`EnhancedFattyAcidMetabolism` (derived class):** Extends the base class with detailed reaction modeling using SMARTS patterns.



#### Molecular Representation

Fatty acids and their intermediates are represented as RDKit molecule objects, created from SMILES (Simplified Molecular Input Line Entry System) notation. This representation enables:

- Accurate modeling of molecular structures  
- Precise tracking of atom positions and bond types  
- Identification of functional groups via SMARTS patterns

#### Reaction Modeling

Chemical reactions in the β-oxidation pathway are modeled using SMARTS reaction patterns, which define transformations between molecular structures. The implementation includes specific patterns for each step of the β-oxidation cycle, as defined by the `SMARTS_REACTIONS` dictionary. These patterns use atom mapping to track specific atoms through the transformation, ensuring chemical accuracy.
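
To make the idea of an atom-mapped reaction pattern concrete, here is a minimal sketch of a single dehydrogenation step in RDKit. The contents of `SMARTS_REACTIONS` are not reproduced in this report, so the pattern `DEHYDROGENATION_SMARTS`, the helper `dehydrogenate`, and the use of an S-methyl thioester as a stand-in for the CoA group are all illustrative assumptions, not the app's actual definitions. Double-bond stereochemistry is left unspecified for simplicity.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Hypothetical pattern: the alpha/beta CH2-CH2 next to a thioester becomes C=C.
# Every atom that must survive the reaction carries a map number.
DEHYDROGENATION_SMARTS = (
    "[CH2:1][CH2:2][C:3](=[O:4])[S:5]>>[C:1]=[C:2][C:3](=[O:4])[S:5]"
)


def dehydrogenate(smiles: str) -> str:
    """Apply the sketch pattern once and return the product as canonical SMILES."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"invalid SMILES: {smiles!r}")
    rxn = AllChem.ReactionFromSmarts(DEHYDROGENATION_SMARTS)
    outcomes = rxn.RunReactants((mol,))
    if not outcomes:
        raise ValueError("no saturated alpha/beta carbons next to a thioester")
    product = outcomes[0][0]
    # Reaction products are returned unsanitized; this recomputes valences and Hs.
    Chem.SanitizeMol(product)
    return Chem.MolToSmiles(product)


# S-methyl thiobutanoate stands in for butyryl-CoA in this sketch.
print(dehydrogenate("CCCC(=O)SC"))
```

Mapping the carbonyl oxygen and the sulfur as well as the carbons matters here: unmapped reactant atoms that are matched by the pattern would be dropped from the product.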

#### Simulation Logic Structure

The simulation follows a hierarchical structure that mirrors the biological process:

1. **Input Processing and Initialization**

   The `_process_input` method interprets input as SMILES strings and creates molecular representations. Basic molecular properties are calculated, including chain length and double bond positions.

2. **Activation and Transport (Preparatory Phase)**

   - Activation: The `activate_fatty_acid` method simulates the ATP-dependent activation of fatty acids to acyl-CoA derivatives.  
   - Transport: The `carnitine_shuttle` method conceptually represents the transport of acyl-CoA molecules into the mitochondrial matrix.

3. **β-Oxidation Cycle**

   The core simulation occurs in the `beta_oxidation_cycle` method, which executes four sequential steps, using the `run_reaction` method:

   - Dehydrogenation: Removal of hydrogen atoms to create a double bond, with special handling for pre-existing double bonds.  
   - Hydration: Addition of water across the double bond.  
   - Oxidation: Conversion of the hydroxyl group to a ketone.  
   - Thiolysis: Cleavage of the molecule to release a two-carbon acetyl-CoA unit.

   For fatty acids with existing double bonds, the system automatically detects their position using the `find_carbonyl_carbon`, `get_alpha_beta_gamma_carbons`, `has_alpha_beta_double_bond`, and `has_beta_gamma_double_bond` methods, and applies the appropriate isomerase-assisted dehydrogenation pathway.

4. **Complete Oxidation Control Flow**

   The `run_complete_oxidation` method orchestrates the entire process:

   1. Activate fatty acid → acyl-CoA  
   2. Transport acyl-CoA to mitochondria  
   3. WHILE chain_length > 3:  
      a. Execute β-oxidation cycle  
      b. Track intermediate products  
      c. Calculate ATP yields  
   4. Handle final products (acetyl-CoA or propionyl-CoA)  
   5. Return final results and summaries

   Each cycle removes two carbon atoms until the fatty acid is completely degraded.

5. **ATP Yield Calculation**

   ATP yield is calculated based on the cofactors generated during β-oxidation:

   - FADH₂ (from dehydrogenation): 1.5 ATP equivalents  
   - NADH (from oxidation): 2.5 ATP equivalents  
   - Acetyl-CoA: Tracked for further metabolism (not directly calculated in ATP)

   Special cases are handled for:

   - Odd-chain fatty acids (producing propionyl-CoA)  
   - Unsaturated fatty acids (requiring alternative pathways)

6. **Data Preparation for Visualization**

   The `prepare_data_for_visualization` method organizes the simulation results into a structured format suitable for rendering and analysis, including:

   - Step-by-step reaction information  
   - Molecular formulas and SMILES strings  
   - Cycle-specific details  
   - ATP yield statistics
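
The control flow described above can be sketched without any chemistry library by tracking only the chain length. This is a simplified illustration for saturated chains, not the body of `run_complete_oxidation`: the function name `simulate_chain_shortening`, the record keys, and the `max_cycles` failsafe value are assumptions. The loop stops once three or fewer carbons remain, so that a three-carbon propionyl-CoA is left intact.

```python
STEPS = ("dehydrogenation", "hydration", "oxidation", "thiolysis")


def simulate_chain_shortening(n_carbons: int, max_cycles: int = 100):
    """Return (cycles, final_product) for a saturated acyl chain of n_carbons."""
    if n_carbons < 4:
        raise ValueError("at least 4 carbons are needed for one full cycle")
    chain_length = n_carbons
    cycles = []
    while chain_length > 3:
        if len(cycles) >= max_cycles:  # failsafe against a runaway loop
            raise RuntimeError("cycle limit exceeded")
        cycles.append({
            "cycle": len(cycles) + 1,
            "carbons_in": chain_length,
            "carbons_out": chain_length - 2,
            "steps": list(STEPS),
            "fadh2": 1,       # from dehydrogenation (saturated chain assumed)
            "nadh": 1,        # from oxidation
            "acetyl_coa": 1,  # released by thiolysis
        })
        chain_length -= 2
    final_product = "acetyl-CoA" if chain_length == 2 else "propionyl-CoA"
    return cycles, final_product


cycles, final_product = simulate_chain_shortening(16)  # palmitic acid
print(len(cycles), final_product)
```

For an even-chain fatty acid the remaining two-carbon fragment is itself an acetyl-CoA, so the total acetyl-CoA count is the number of cycles plus one. Unsaturated chains, where some cycles skip the FADH₂-producing step, are not handled in this sketch.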

#### Edge Case Handling

The implementation includes robust handling of several edge cases:

- Detection and special processing for unsaturated fatty acids  
- Distinct pathways for odd-chain fatty acids  
- Failsafe conditions to prevent infinite loops  
- Error handling for invalid molecular structures



---
### Home


#### Session Management and Data Flow

To ensure a coherent user experience, session state variables were initialized to track user progress and store molecular data across interactions. Upon startup, the app initializes:

- `welcome_page_seen`: a boolean indicating if the user has visited the introductory home page.  
- `fa_data`: a dictionary to store molecular data including the SMILES string, molecular object, delta-notation, and a processor instance for further metabolic simulation.

This state management system enables navigation between the tabs of the application (Lynen Spiral and Mechanisms) while ensuring data remains consistent throughout the user session.
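
As an illustration of this initialization pattern, the sketch below fills in defaults only when a key is missing, so values survive Streamlit's script reruns. The helper name `init_session_state` and the exact default values are assumptions; a plain `dict` stands in for `st.session_state`, which supports the same `in` check and item assignment.

```python
def init_session_state(state):
    """Populate missing keys with defaults and leave existing values untouched."""
    defaults = {
        "welcome_page_seen": False,
        "fa_data": {
            "smiles": None,     # SMILES string of the selected fatty acid
            "notation": None,   # delta-notation, e.g. 18:1(Δ9)
            "name": None,
            "mol": None,        # RDKit molecule object
            "processor": None,  # metabolism simulation instance
        },
    }
    for key, value in defaults.items():
        if key not in state:
            state[key] = value
    return state


# In the app this would be called as init_session_state(st.session_state).
state = init_session_state({"welcome_page_seen": True})
print(state["welcome_page_seen"], sorted(state["fa_data"]))
```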

#### Molecular Image Rendering (`mol_to_svg_image`)

To provide a 2D visualization of molecules, the `mol_to_svg_image()` function is used. It takes a molecular object as input and outputs an SVG image. Special post-processing is applied to clean XML tags to comply with Streamlit’s rendering requirements. This method ensures molecules are visualized in high resolution with clearly labeled atoms and bonds.
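
The post-processing step can be sketched with the standard library alone. The report does not list which tags are cleaned, so this example assumes two common adjustments: dropping the leading XML declaration and removing the `svg:` namespace prefix that some RDKit versions emit on element names. The function name `clean_svg` is hypothetical.

```python
import re


def clean_svg(svg_text: str) -> str:
    """Strip the XML declaration and any 'svg:' element prefix from SVG text."""
    # Remove a leading <?xml ... ?> declaration, if present.
    cleaned = re.sub(r"^\s*<\?xml.*?\?>\s*", "", svg_text, flags=re.DOTALL)
    # Turn <svg:rect> and </svg:rect> into <rect> and </rect>.
    cleaned = re.sub(r"<(/?)svg:", r"<\1", cleaned)
    return cleaned.strip()


raw = (
    "<?xml version='1.0' encoding='iso-8859-1'?>\n"
    "<svg:svg width='10px'><svg:rect/></svg:svg>"
)
print(clean_svg(raw))
```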

#### Structural Interpretation (`smiles_to_delta`)

To enrich chemical understanding, the `smiles_to_delta()` function converts a SMILES string into a delta-notation descriptor (e.g., 18:1(Δ9)), identifying the total carbon count and positions/configurations of double bonds. The function parses atom connectivity to identify the carboxylic acid carbon and traverses the carbon backbone, recording bond types and stereochemistry (cis/trans) when present. This output provides a concise yet chemically informative summary for user-displayed molecules.
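
The formatting half of this conversion is easy to show in isolation. The sketch below assumes the carbon count and double bond positions have already been extracted from the molecule. It does not parse SMILES and omits cis/trans labels, and the name `format_delta_notation` is hypothetical.

```python
def format_delta_notation(n_carbons: int, double_bond_positions) -> str:
    """Format a descriptor such as 18:1(Δ9) from a carbon count and positions."""
    positions = sorted(double_bond_positions)
    if not positions:
        return f"{n_carbons}:0"
    joined = ",".join(str(p) for p in positions)
    return f"{n_carbons}:{len(positions)}(Δ{joined})"


print(format_delta_notation(18, [9]))      # oleic acid
print(format_delta_notation(18, [12, 9]))  # linoleic acid
print(format_delta_notation(16, []))       # palmitic acid
```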

#### 2D Structure Output (`display_2d_structure`)

This function retrieves the molecular object from session state. It displays:

- Canonical SMILES notation  
- Molecular formula (using `CalcMolFormula`)  
- Delta-notation (using `smiles_to_delta`)  
- A scalable SVG image (using `mol_to_svg_image`)

The output is formatted in a responsive layout using Streamlit's column API.

#### 3D Molecular Visualization (`display_3d_structure`)

For spatial understanding, `display_3d_structure()` generates a 3D plot of the molecule using RDKit’s 3D embedding (`EmbedMolecule`) and energy minimization (`MMFFOptimizeMolecule`). The molecule is rendered in an interactive Plotly figure, with atoms displayed as spheres and bonds as styled lines (e.g., double bonds rendered as parallel lines). Color and size mappings visually differentiate atom types (e.g., red for oxygen, grey for carbon), and stereochemistry is preserved.

#### Main Interaction Logic (`main_page`)

The central method, `main_page()`, defines the main flow of user interaction:

1. **Molecule Selection**  
   Users either select a predefined fatty acid from a curated list or construct a custom one. Selected SMILES strings and annotations are stored locally.

2. **Custom Molecule Builder**  
   Users specify chain length and input desired double bond positions. The `validate_double_bond_input()` method ensures:  
   - No duplicates or adjacent double bonds  
   - Chemically plausible locations  
   - Valid stereochemistry (cis/trans)

3. **SMILES Generation**  
   Based on user inputs, `build_fatty_acid_smiles()` programmatically assembles a custom SMILES string, accounting for geometric isomerism. The string begins with a carboxylic acid head and appends carbon atoms, inserting `/C=C\` or `/C=C/` as dictated by cis/trans preferences.

4. **Data Handling**  
   When the user presses "Visualize Fatty Acid," the SMILES is parsed via RDKit. The resulting molecule is stored in session state alongside metadata (`smiles`, `notation`, `name`, `mol`, and `processor`, an `EnhancedFattyAcidMetabolism` instance), and the app navigates to the Visualizer tab.

5. **Navigation and Display**  
   The sidebar navigation allows users to toggle between the Home and Visualizer pages. If a valid molecule is loaded, `display_2d_structure()` and `display_3d_structure()` are called, showing both static and dynamic representations.
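
The validation and SMILES assembly steps above can be sketched in plain Python. The functions `check_double_bonds` and `assemble_fatty_acid_smiles` are hypothetical stand-ins for `validate_double_bond_input()` and `build_fatty_acid_smiles()`, whose signatures are not shown in this report. Positions are counted from the carboxyl carbon (C1), and the allowed range C2 to C(n-2) is an assumption of this sketch, chosen so that both neighbouring single bonds exist and can carry a direction marker.

```python
def check_double_bonds(n_carbons, double_bonds):
    """Return a list of problems; an empty list means the input is accepted.

    double_bonds is a list of (position, configuration) tuples, where position p
    denotes a double bond between C_p and C_(p+1).
    """
    errors = []
    positions = [pos for pos, _ in double_bonds]
    if len(set(positions)) != len(positions):
        errors.append("duplicate double bond positions")
    ordered = sorted(set(positions))
    if any(b - a == 1 for a, b in zip(ordered, ordered[1:])):
        errors.append("adjacent double bonds")
    for pos, config in double_bonds:
        if not 2 <= pos <= n_carbons - 2:
            errors.append(f"position {pos} is outside 2..{n_carbons - 2}")
        if config not in ("cis", "trans"):
            errors.append(f"unknown configuration {config!r} at position {pos}")
    return errors


def assemble_fatty_acid_smiles(n_carbons, double_bonds):
    """Build a SMILES string from the carboxyl head towards the methyl end."""
    errors = check_double_bonds(n_carbons, double_bonds)
    if errors:
        raise ValueError("; ".join(errors))
    configs = dict(double_bonds)
    markers = {}  # single bond i (joining C_i and C_(i+1)) -> "/" or "\"
    for pos in sorted(configs):
        lead = markers.setdefault(pos - 1, "/")
        flipped = "\\" if lead == "/" else "/"
        # Same marker on both sides means trans, opposite markers mean cis.
        markers[pos + 1] = lead if configs[pos] == "trans" else flipped
    parts = ["OC(=O)"]  # carboxylic acid head; its carbon is C1
    for i in range(2, n_carbons + 1):
        bond = i - 1  # bond joining C_(i-1) and C_i
        parts.append("=C" if bond in configs else markers.get(bond, "") + "C")
    return "".join(parts)


print(assemble_fatty_acid_smiles(18, [(9, "cis")]))  # oleic acid
```

Reusing an existing marker as the leading bond of the next double bond keeps conjugated systems consistent, since one single bond then serves both neighbours.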

---
### Lynen Spiral 


This page visualizes the **β-oxidation of a fatty acid** as a **3D Lynen spiral**, using color-coded reaction steps.

It leverages a custom class, `EnhancedFattyAcidMetabolism`, to simulate the biochemical pathway and display each reaction cycle interactively.  
The app checks for a molecule input (SMILES string) saved in `st.session_state`, runs the β-oxidation process, and generates a 3D spiral plot where each point represents a reaction step colored by type (e.g., oxidation, hydration).  

Users can interactively navigate cycles and steps, and view **2D molecular structures** for any intermediate along the spiral.

The main functions used to build the 3D spiral visualization are:

**`run_beta_oxidation(smiles)`**  
This function simulates the complete β-oxidation pathway for the input fatty acid represented by its SMILES string.  
It creates an instance of `EnhancedFattyAcidMetabolism`, runs the complete oxidation pathway using `.run_complete_oxidation()`, and returns a structured dictionary containing all reaction cycles and steps.


**`create_lynens_spiral(cycles)`**  
This function generates a 3D Plotly visualization of the β-oxidation steps arranged in a spiral (the "Lynen spiral").  
It iterates through each cycle and step to compute spiral coordinates `(x, y, z)`, and maps each reaction type to a color (from the `reaction_colors` dictionary).  
Finally, it labels each cycle, adds a custom legend for reaction types, and configures the Plotly scene for a clean 3D layout.  
This is the **main visual rendering function**.
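
One way to compute such coordinates is to place four steps on each turn of a helix, so that the same reaction type always sits at the same angle. The sketch below uses the standard `math` module instead of NumPy to stay dependency-free. The function name `spiral_points`, the `radius` and `pitch` parameters, and the colors in `STEP_COLORS` are illustrative assumptions rather than the values in `reaction_colors`.

```python
import math

STEP_NAMES = ("dehydrogenation", "hydration", "oxidation", "thiolysis")
# Hypothetical color mapping, one color per reaction type.
STEP_COLORS = {
    "dehydrogenation": "red",
    "hydration": "blue",
    "oxidation": "green",
    "thiolysis": "orange",
}


def spiral_points(n_cycles: int, radius: float = 1.0, pitch: float = 1.0):
    """Return one point per reaction step; each cycle is one downward turn."""
    steps_per_turn = len(STEP_NAMES)
    points = []
    for cycle in range(n_cycles):
        for s, name in enumerate(STEP_NAMES):
            k = cycle * steps_per_turn + s  # global step index
            theta = 2 * math.pi * k / steps_per_turn
            points.append({
                "cycle": cycle + 1,
                "step": name,
                "x": radius * math.cos(theta),
                "y": radius * math.sin(theta),
                "z": -pitch * k / steps_per_turn,
                "color": STEP_COLORS[name],
            })
    return points


pts = spiral_points(7)  # e.g. the seven cycles of palmitic acid
print(len(pts), pts[0]["step"], pts[-1]["step"])
```

The `x`, `y`, `z`, and `color` values can then be collected into lists and handed to a 3D scatter trace such as Plotly's `go.Scatter3d`.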


**`mol_to_svg_image(mol, width=600, height=300)`**  
This function generates a 2D SVG image of a given molecule using RDKit’s drawing tools.  
It uses `MolDraw2DSVG` to draw the molecule and returns the SVG as a cleaned string.  
It is used to display the chemical structure of each reaction intermediate when a user selects a step in the sidebar.



---

### Mechanisms 


#### Languages and libraries
- **Python**  
- **Streamlit** provides the reactive web front-end  
- **RDKit** is responsible for cheminformatics  



#### `run_complete_oxidation()`

It loops until the acyl chain reaches ≤ 3 carbons, creating a dict per cycle containing:  
- **steps** (input SMILES, output SMILES, SMARTS)  
- **by-products** (FADH₂, NADH counts)  
- **chain_state** (current length, double-bond status)



#### `calculate_atp_yield()`

Transforms the accumulated redox equivalents and acetyl-CoA counts into ATP:  
- 1 FADH₂ = **1.5 ATP**  
- 1 NADH = **2.5 ATP**  
- 1 acetyl-CoA = **10 ATP**  
- Activation cost = **−2 ATP**  
- Odd-chain propionyl-CoA correction = **+15 ATP**
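
Using the conversion factors listed above, the arithmetic can be checked by hand with a short sketch. The function name `atp_yield_from_counts` and its signature are assumptions; only the numeric factors come from this report.

```python
ATP_PER_FADH2 = 1.5
ATP_PER_NADH = 2.5
ATP_PER_ACETYL_COA = 10
ATP_PER_PROPIONYL_COA = 15  # odd-chain correction used in this report
ACTIVATION_COST = 2


def atp_yield_from_counts(n_fadh2, n_nadh, n_acetyl_coa, n_propionyl_coa=0):
    """Return a per-source ATP breakdown plus the net total."""
    breakdown = {
        "fadh2": n_fadh2 * ATP_PER_FADH2,
        "nadh": n_nadh * ATP_PER_NADH,
        "acetyl_coa": n_acetyl_coa * ATP_PER_ACETYL_COA,
        "propionyl_coa": n_propionyl_coa * ATP_PER_PROPIONYL_COA,
        "activation": -ACTIVATION_COST,
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown


# Palmitic acid (16:0): 7 cycles give 7 FADH2, 7 NADH and 8 acetyl-CoA.
print(atp_yield_from_counts(7, 7, 8)["total"])
```

For palmitic acid this gives 10.5 + 17.5 + 80 − 2 = 106 ATP.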



#### `_display_cycle_steps()`

It is a convenience routine that iterates over a cycle dict and constructs a two-column layout:  
- **left** → reactant/product images with a coloured “CoA” pseudo-atom label  
- **right** → explanatory markdown, LaTeX equations, SMARTS pattern and expected ATP delta.


## 6. Results
---

### Home

Upon launching the application, users are presented with a welcoming homepage that introduces key biochemical concepts—such as the biological roles of fatty acids, the process of β-oxidation, and the energy-carrying molecules involved (e.g., Acetyl-CoA, FADH₂, and NADH).

From the homepage, users navigate to the main interaction panel, where they choose between exploring predefined fatty acids (e.g., palmitic acid or oleic acid) or constructing a custom molecule. The selection interface employs dropdown menus and sliders, minimizing input errors while allowing flexibility. For custom fatty acid creation, interactive form fields prompt users to define chain length (adjustable via a slider) and specify double bond positions and their configuration (cis/trans). A validation system prevents chemically impossible configurations, such as double bonds at invalid carbon positions, ensuring users generate only feasible structures.

Once a molecule is selected or designed, users trigger visualization by clicking the **"Visualize Fatty Acid"** button. The application then transitions to a dedicated viewer, split into two tabbed panels: one displaying a 2D structural diagram with labeled annotations (molecular formula, systematic name) and another presenting an interactive 3D model. The 3D viewer allows rotation and zooming, which enhances spatial understanding of molecular geometry.

---

### Lynen Spiral

Once a fatty acid has been selected—either from the library of proposed molecules or through custom input—a tab on the left called **Lynen Spiral** displays a 3D visualization of the β-oxidation process. As previously explained, fatty acid breakdown involves a repeating sequence of biochemical reactions that continue until the entire molecule is degraded.

We chose to represent these reaction cycles using a 3D spiral to make the progression of steps visually intuitive. Each type of reaction is color-coded consistently across cycles, allowing users to easily identify recurring steps and their order within each cycle. The spiral can be rotated and zoomed in for better exploration and clarity.

To enhance interactivity, each cycle features a slider that lets users select individual steps. This reveals the molecular product at that point and illustrates how each mechanism progressively breaks down the fatty acid.

At the bottom of the page, two navigation buttons help guide the user. One returns to the home page, while the other opens the **Mechanisms** page. This detailed view provides a closer look at each individual step within the cycles, along with a calculation of ATP produced and consumed during fatty acid breakdown.

---

### Mechanisms

**ATP dashboard**  
Four widgets summarise total ATP, activation cost, and the individual FADH₂ and NADH contributions. A markdown block underneath provides the full arithmetic so students can verify the numbers manually.

**Cycle selector**  
Lists “Cycle 1 … Cycle n” plus “All Cycles”. On selection, `_display_cycle_steps()` is called either once (single-cycle view) or in a loop (full view).

**Beta Oxidation Steps**  
For each chosen cycle, the page shows a two-column layout. The left column depicts the reactant and product. The right column shows the biochemical step name (e.g., Dehydrogenation, Hydration, Oxidation, Thiolysis), a Markdown block explaining the chemistry, a LaTeX-rendered equation, the SMARTS pattern, the ATP attributable to that step, and the exact SMILES transformation rendered as code.

**Product summary and navigation**  
A final two-column report states the acetyl-CoA count and flags the presence/absence of propionyl-CoA. Two widgets route the user back to the **Lynen Spiral** animation page or the **Home** page.


---
## 7. Discussion

We successfully created a visualization tool using cheminformatics-specific methods, particularly SMARTS patterns and RDKit. At the beginning, we were uncertain whether we could effectively represent both the spiral and the full β-oxidation mechanism on a single webpage. Ultimately, we managed to achieve this using Streamlit.

One of the main challenges we encountered was navigating between pages in Streamlit, as the platform isn’t optimized for multi-page navigation. This is an area for improvement: in the future, it may be worth considering alternative frameworks that offer smoother transitions and more advanced web functionality, such as Dash, which has built-in support for multi-page apps with routing, or Flask, which allows customizable navigation.

The key strengths of our project lie in the comprehensive visualization of the β-oxidation pathway and the interactive navigation across different views. Additionally, we wanted to make the tool educational by including content to help users understand what fatty acids are and why β-oxidation is biologically important.

However, there are some limitations. For example, the tool allows only limited customization of input molecules: only the chain length and the double bonds (number, position, and configuration) can be modified, since β-oxidation operates strictly on fatty acids. An interesting future improvement would be to incorporate a larger database of known fatty acids, along with a parsing system that could match user inputs to known molecules and return their names.

Further improvements could also include more interactive elements, such as quizzes or guided feedback, along with deeper biological explanations. Finally, this project serves as a strong foundation for a more ambitious vision: a comprehensive platform for visualizing all major metabolic pathways in the human body, such as the citric acid cycle, glycolysis, and beyond.