# Step 1: Prepare Data

To load the raw molecule data for the experiment, do the following:

1. Find the molecule file in mol2 format in the [Protein Data Bank](https://www.rcsb.org/downloads/ligands) (PDB) and download it. This example uses the [117 ligand](http://www.rcsb.org/ligand/117).
2. Upload the molecular file to the directory **source/src/molecular-folding/molecule-data**.
3. Update the parameter 'raw_path' to be the relative path of the file.
4. Update the parameters 's3_bucket' and 'prefix', which are used to store the optimization results. 

    **Note**: You can use the S3 bucket name shown in the CloudFormation output.
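
Before pointing 'raw_path' at a downloaded file, it can help to confirm that the file really is in mol2 format. The following is a minimal sketch, not part of the notebook's `utility` package: the helper `count_mol2_records` and the three-atom sample are hypothetical, and the check relies only on the `@<TRIPOS>` section headers that mol2 files use.

```python
from collections import Counter


def count_mol2_records(text):
    """Count the non-blank data lines under each @<TRIPOS> section header."""
    counts = Counter()
    section = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("@<TRIPOS>"):
            section = stripped[len("@<TRIPOS>"):]
            counts[section] += 0  # register the section even if it turns out empty
        elif section and stripped and not stripped.startswith("#"):
            counts[section] += 1
    return dict(counts)


# Tiny hand-written sample (NOT the 117 ligand), only to show the layout.
sample = """@<TRIPOS>MOLECULE
demo
 3 2 0 0 0
SMALL
NO_CHARGES

@<TRIPOS>ATOM
      1 C1          0.0000    0.0000    0.0000 C.3     1  DEM  0.0000
      2 C2          1.5000    0.0000    0.0000 C.3     1  DEM  0.0000
      3 O1          2.0000    1.2000    0.0000 O.3     1  DEM  0.0000
@<TRIPOS>BOND
     1     1     2    1
     2     2     3    1
"""

record_counts = count_mol2_records(sample)
print(record_counts)
```

To check the real download, read the file's text (for example with `open(raw_path).read()`) and pass it to the same helper. A file with no `ATOM` or `BOND` section is probably not a usable mol2 file.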

In [None]:
from utility.MoleculeParser import MoleculeData
from utility.QMUQUBO import QMUQUBO
from utility.AnnealerOptimizer import Annealer
from utility.ResultProcess import ResultParser
import time

timestamp = time.strftime("%Y%m%d-%H")

In [2]:
# initial parameters for experiment data
s3_bucket = "xxxx" # the name of the bucket
prefix = "xxxx" # the name of the folder in the bucket

raw_path = './molecule-data/117_ideal.mol2' # the mol2 file for this experiment
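
A quick preflight check can catch settings that were left at their placeholder values before anything is parsed or uploaded. This is only a sketch under stated assumptions: `check_settings` is a hypothetical helper, not something the notebook provides, and it assumes the placeholder is the literal string `xxxx` used above.

```python
from pathlib import Path


def check_settings(s3_bucket, prefix, raw_path, placeholder="xxxx"):
    """Return a list of problems; an empty list means the settings look usable."""
    problems = []
    if s3_bucket == placeholder:
        problems.append("s3_bucket still holds the placeholder value")
    if prefix == placeholder:
        problems.append("prefix still holds the placeholder value")
    path = Path(raw_path)
    if path.suffix.lower() != ".mol2":
        problems.append(f"{raw_path} does not have a .mol2 extension")
    if not path.is_file():
        problems.append(f"{raw_path} was not found relative to {Path.cwd()}")
    return problems


# Example call with the untouched placeholder values.
problems = check_settings("xxxx", "xxxx", "./molecule-data/117_ideal.mol2")
for problem in problems:
    print("-", problem)
```

The check does not contact S3, so it cannot tell whether the bucket exists or is writable. It only flags values that were never edited and a path that does not resolve from the current working directory.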

In [3]:
mol_data = MoleculeData(raw_path, 'qmu')

data_path = mol_data.save("latest")

num_rotation_bond = mol_data.bond_graph.rb_num
print(f"You have loaded the raw molecule data and saved as {data_path}. \n\
This molecule has {num_rotation_bond} rotatable bonds")

INFO:root:parse mol2 file!
INFO:root:finish save qmu_117_ideal_data_latest.pickle


You have loaded the raw molecule data and saved as ./qmu_117_ideal_data_latest.pickle. 
This molecule has 23 rotatable bonds


After the cell runs, the processed data is saved as **qmu_117_ideal_data_latest.pickle** and **data_path** is set to the path of that file. The output shows that this molecule has 23 rotatable bonds.
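
The saved file can be read back in a later session without re-parsing the mol2 file. The sketch below assumes the file is an ordinary Python pickle, which the file extension suggests but this step does not confirm. The `load_pickle` helper and the stand-in dictionary are hypothetical, and the structure of the real stored object is not documented here.

```python
import os
import pickle
import tempfile


def load_pickle(path):
    """Read a single pickled object from the given path."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Stand-in round trip with a plain dict, so the sketch runs without the
# notebook's utility package or the real data file.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_data_latest.pickle")
    with open(demo_path, "wb") as handle:
        pickle.dump({"rb_num": 23}, handle)
    restored = load_pickle(demo_path)

print(restored)
```

In the notebook, the equivalent call would be `load_pickle(data_path)`. Unpickling needs the classes of the stored object to be importable, so the `utility` package would likely have to be on the import path. Only unpickle files from a trusted source, because loading a pickle can execute arbitrary code.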