In [1]:
import os
import ast
import pandas as pd

### Grouping of the different amino acids that can be found in a protein


This code reads amino acid data from a text file and displays each amino acid's name, single-letter code, three-letter code, and SMILES structure as a table. The following steps outline what the code does:

1. **Import Required Libraries**: The code uses the `os`, `ast`, and `pandas` libraries. `os` handles file path manipulation, `ast` safely evaluates a string containing a Python literal, and `pandas` holds the parsed records in a DataFrame.

2. **Determine the Path to the Data File**: 
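    - The current directory is obtained using `os.path.dirname(os.path.abspath('__file__'))`. Because `'__file__'` is a quoted string rather than the `__file__` variable, this resolves to the current working directory, which is normally the notebook's folder.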
    - The project root directory is determined by navigating one level up from the current directory.
    - The full path to the `aminoacids.txt` file is constructed by joining the project root with the `data` folder and the filename.

3. **Read the Amino Acid Data from the File**:
    - The file `aminoacids.txt` is opened and its content is read into a string variable.
    - Leading and trailing whitespace is stripped from the string. The code expects the file to contain only a Python list literal, with no variable assignment in front of it.


4. **Convert the String Data to a Python List**:
    - The `ast.literal_eval` function is used to safely evaluate the string containing the list of amino acids and convert it into a Python list of dictionaries.
  

5. **Display Amino Acid Details**:
    - The list of amino acids is loaded into a pandas DataFrame, one row per amino acid.
    - The DataFrame is displayed with columns for the name, single-letter code, three-letter code, and SMILES structure.
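
As a minimal sketch of step 2, the snippet below shows what the quoted `'__file__'` does: `os.path.abspath` treats it as an ordinary relative file name, so the result depends on the current working directory rather than on the notebook file itself. The `data/aminoacids.txt` layout is taken from the code in this notebook, and nothing here checks that the file actually exists.

```python
import os

# '__file__' in quotes is a plain string, not the __file__ variable,
# so abspath() joins it onto the current working directory.
current_dir = os.path.dirname(os.path.abspath('__file__'))

# One level up from the working directory, then into data/.
project_root = os.path.dirname(current_dir)
file_path = os.path.join(project_root, 'data', 'aminoacids.txt')

print("working directory:", current_dir)
print("project root:     ", project_root)
print("data file path:   ", file_path)
```

If the kernel is started from a different folder, `current_dir` changes with it, so the constructed path is only correct when the working directory is the notebook's folder.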

Loading the data this way puts the amino acid records into a structured table that is easy to inspect and analyze.
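
The parsing in steps 3 and 4 can be sketched without the data file. The example below uses a made-up two-record string in place of `aminoacids.txt` (the real file's contents are not shown here), and the helper `parse_literal_list` is a hypothetical name, not something the notebook defines. It accepts either a bare list literal or text that starts with a variable assignment, which the notebook's own code cell does not handle.

```python
import ast

def parse_literal_list(text):
    """Parse a list literal, optionally preceded by 'name = '."""
    text = text.strip()
    # If the text does not start with '[', assume a 'name = [...]' form
    # and keep only what follows the first '='.
    if not text.startswith('['):
        _, _, text = text.partition('=')
    # literal_eval only accepts Python literals, so arbitrary code is rejected.
    return ast.literal_eval(text.strip())

# Made-up sample standing in for the file contents (illustration only).
sample = (
    "amino_acids = [{'name': 'Glycine', 'single_letter': 'G'}, "
    "{'name': 'Alanine', 'single_letter': 'A'}]"
)

records = parse_literal_list(sample)
for record in records:
    print(record['name'], record['single_letter'])
```

Using `ast.literal_eval` instead of `eval` means that a file containing function calls or other executable code raises an error rather than running.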

In [2]:
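# Determine the path to the aminoacids.txt file.
# '__file__' is a quoted string here, so abspath() resolves it against
# the current working directory (normally the notebook's folder).
current_dir = os.path.dirname(os.path.abspath('__file__'))
project_root = os.path.dirname(current_dir)
file_path = os.path.join(project_root, 'data', 'aminoacids.txt')

# Read the amino acids data from the file
with open(file_path, 'r') as f:
    data = f.read().strip()

# Parse the text as a Python literal (a list of dictionaries)
amino_acids = ast.literal_eval(data)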

# Create a DataFrame
df = pd.DataFrame(amino_acids)

# Display the DataFrame
print("Amino Acid Data:")
display(df)

Amino Acid Data:


| | name | single_letter | three_letter | smiles |
|---|---|---|---|---|
| 0 | Alanine | A | Ala | CC(=O)N |
| 1 | Arginine | R | Arg | NC(CCNC(=N)N)C(=N)N |
| 2 | Asparagine | N | Asn | CC(=O)N[C@@H](CCC(=O)O)C(N)=O |
| 3 | Aspartic acid | D | Asp | CC(C(=O)O)C(N)=O |
| 4 | Cysteine | C | Cys | C(C(=O)O)N |
| 5 | Glutamine | Q | Gln | CC(=O)NC[C@H](C(=O)O)N |
| 6 | Glutamic acid | E | Glu | C(CC(=O)O)C(C(=O)O)N |
| 7 | Glycine | G | Gly | C(C(=O)O)N |
| 8 | Histidine | H | His | C1=CNC=N1 |
| 9 | Isoleucine | I | Ile | CC[C@H](C)C(C)C |
