<a href="https://colab.research.google.com/github/Takumi-Oshiro/Hierarchical-clustering-for-Crystallization-/blob/main/SASA_calc.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Task
Upload a PDB file, calculate the Solvent Accessible Surface Area (SASA) for each residue using freesasa, save the results to a CSV file named "sasa_per_residue.csv", and provide a download link for the CSV file.

## Install required packages
Install Biopython and freesasa, along with ipywidgets and nglview for visualization.


In [None]:
import sys
# Install the required packages.
# biopython: reads PDB files and manipulates structures
# freesasa: calculates SASA
# ipywidgets, nglview: interactive 3D visualization of structures
!{sys.executable} -m pip install biopython freesasa ipywidgets nglview

Collecting biopython
  Using cached biopython-1.85-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (13 kB)
Collecting freesasa
  Using cached freesasa-2.2.1.tar.gz (270 kB)
  Preparing metadata (setup.py) ... done
Collecting nglview
  Using cached nglview-3.1.4.tar.gz (21.9 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Collecting ipywidgets
  Using cached ipywidgets-8.1.7-py3-none-any.whl.metadata (2.4 kB)
Collecting notebook>=7 (from nglview)
  Using cached notebook-7.4.4-py3-none-any.whl.metadata (10 kB)
Collecting jupyterlab>=3 (from nglview)
  Using cached jupyterlab-4.4.5-py3-none-any.whl.metadata (16 kB)
Collecting comm>=0.1.3 (from ipywidgets)
  Using cached comm-0.2.2-py3-none-any.whl.metadata (3.7 kB)
Collecting widgetsnbextension~=4.0.14 (from ipywidgets)
  Using 

## Upload the PDB file
Remove water molecules and any unneeded chains from the file before uploading it.
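
The cleanup described above can also be done in code rather than by hand. The following is a minimal sketch, assuming a standard fixed-column PDB file in which the residue name occupies columns 18-20 and the chain ID column 22. The function name `filter_pdb_text`, its parameters, and the default water residue names (`HOH`, `WAT`) are hypothetical choices made for this illustration and are not part of the notebook.

```python
def filter_pdb_text(pdb_text, keep_chains=None, water_names=("HOH", "WAT")):
    """Return PDB text without water records and, optionally, without unwanted chains.

    Hypothetical helper for illustration; assumes fixed-column PDB records.
    """
    keep = None if keep_chains is None else set(keep_chains)
    kept_lines = []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        # Only coordinate-like records carry a residue name and chain ID.
        if record in ("ATOM", "HETATM", "ANISOU", "TER") and len(line) >= 22:
            residue_name = line[17:20].strip()
            chain_id = line[21]
            if residue_name in water_names:
                continue  # drop water molecules
            if keep is not None and chain_id not in keep:
                continue  # drop chains that were not requested
        kept_lines.append(line)
    return "\n".join(kept_lines) + "\n"


# Tiny made-up example: one atom in chain A, one in chain B, one water.
example = (
    "ATOM      1  N   MET A   1      11.104  13.207   2.100  1.00 20.00           N\n"
    "ATOM      2  CA  MET B   1      12.560  13.100   2.300  1.00 20.00           C\n"
    "HETATM    3  O   HOH A 101      15.000  10.000   5.000  1.00 30.00           O\n"
    "END\n"
)
cleaned = filter_pdb_text(example, keep_chains={"A"})
print(cleaned)
```

In this notebook the same function could be applied to the decoded upload (`uploaded[pdb_filename].decode('utf-8')`) before the SASA step. Files that mark waters with other residue names, or that are in mmCIF format, would need a different filter.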




In [None]:
from google.colab import files

# Show the file upload dialog.
# When the upload finishes, the uploaded file's name and contents are stored in the 'uploaded' dict.
uploaded = files.upload()

# Get the name of the first PDB file the user uploaded.
# Later steps need the file name, so keep it in a variable.
pdb_filename = list(uploaded.keys())[0]
print(f"Uploaded file: {pdb_filename}")

Saving test.pdb to test.pdb
Uploaded file: test.pdb


## Calculate SASA and write the results to CSV
Use freesasa to calculate the SASA of each residue in the uploaded PDB file and write the results to a CSV file.


**Reasoning**:
Calculate the SASA for each residue in the uploaded PDB file using freesasa and write the results to a CSV file.



In [None]:
import freesasa
import pandas as pd
import tempfile
import os

# Get the name and content of the uploaded file
pdb_filename = list(uploaded.keys())[0]
pdb_content = uploaded[pdb_filename].decode('utf-8')

# Initialize temporary file path to None
tmp_pdb_path = None

try:
    # Save the PDB content to a temporary file
    with tempfile.NamedTemporaryFile(suffix=".pdb", delete=False, mode='w', encoding='utf-8') as tmp_pdb:
        tmp_pdb.write(pdb_content)
        tmp_pdb_path = tmp_pdb.name

    # Calculate SASA using freesasa by providing the temporary file path
    structure = freesasa.Structure(tmp_pdb_path)
    result = freesasa.calc(structure)

    # Prepare data for CSV
    sasa_data = {} # Use a dictionary to easily aggregate SASA by residue ID
    # Iterate through atoms in the freesasa structure
    for i in range(structure.nAtoms()):
        # Get atom and residue information from the structure object
        chain_id = structure.chainLabel(i)
        residue_name = structure.residueName(i).strip()
        # residueNumber returns a string; strip any padding so IDs stay clean
        residue_number = structure.residueNumber(i).strip()
        residue_id = f"{chain_id}:{residue_name}:{residue_number}"

        # Get the atom's SASA from the result object
        sasa = result.atomArea(i)

        # Aggregate SASA per residue
        if residue_id not in sasa_data:
            sasa_data[residue_id] = 0.0
        sasa_data[residue_id] += sasa

    # Convert the aggregated data to a list of dictionaries for DataFrame
    sasa_list = [{"Residue ID": res_id, "SASA": total_sasa} for res_id, total_sasa in sasa_data.items()]

    # Create a pandas DataFrame and save to CSV
    sasa_df = pd.DataFrame(sasa_list)
    sasa_df.to_csv("sasa_per_residue.csv", index=False)

    print("SASA calculation complete and results saved to sasa_per_residue.csv")

finally:
    # Clean up the temporary file if it was created
    if tmp_pdb_path and os.path.exists(tmp_pdb_path):
        os.remove(tmp_pdb_path)

SASA calculation complete and results saved to sasa_per_residue.csv
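
The CSV written above has two columns: `Residue ID`, which packs chain, residue name, and residue number into one `chain:name:number` string, and `SASA` (in Å²). As a rough sketch of how the file might be used afterwards, the code below splits the ID back into its parts and ranks residues by exposure. It uses only the standard library; the helper names `load_residue_sasa` and `most_exposed` are hypothetical, and the numbers in the example are made up for illustration rather than taken from a real structure.

```python
import csv
import io


def load_residue_sasa(csv_text):
    """Parse 'Residue ID,SASA' rows into dicts with separate chain/name/number fields."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # The ID format "chain:name:number" matches the notebook's f-string.
        chain_id, residue_name, residue_number = row["Residue ID"].split(":", 2)
        rows.append({
            "chain": chain_id,
            "residue_name": residue_name,
            "residue_number": residue_number.strip(),
            "sasa": float(row["SASA"]),
        })
    return rows


def most_exposed(rows, n=3):
    """Return the n rows with the largest SASA, most exposed first."""
    return sorted(rows, key=lambda r: r["sasa"], reverse=True)[:n]


# Made-up values for illustration only; not results from a real structure.
example_csv = """Residue ID,SASA
A:MET:1,150.5
A:LYS:2,98.2
A:LEU:3,4.7
B:GLU:10,120.0
"""
rows = load_residue_sasa(example_csv)
top = most_exposed(rows, n=2)
for r in top:
    print(r["chain"], r["residue_name"], r["residue_number"], r["sasa"])
```

To run this on the real output, the text of `sasa_per_residue.csv` could be read with `open(...).read()` and passed in place of `example_csv`.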


## Download the CSV file
The CSV file is now ready to download.


In [None]:
from google.colab import files

# Download the generated CSV file "sasa_per_residue.csv".
files.download("sasa_per_residue.csv")
