# Graphein Protein Structure Dataloaders
## PyTorch Geometric Datasets

[API Reference](https://graphein.ai/modules/graphein.ml.html#graphein.ml.datasets.torch_geometric_dataset)

Graphein provides three dataset classes for working with [PyTorch Geometric](https://pytorch-geometric.readthedocs.io/en/latest/):

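* [`ProteinGraphDataset`](https://graphein.ai/modules/graphein.ml.html#graphein.ml.datasets.torch_geometric_dataset.ProteinGraphDataset) - For processing large datasets that can't be kept in memory.
* [`InMemoryProteinGraphDataset`](https://graphein.ai/modules/graphein.ml.html#graphein.ml.datasets.torch_geometric_dataset.InMemoryProteinGraphDataset) - For smaller datasets that can be kept in memory.
* [`ProteinGraphListDataset`](https://graphein.ai/modules/graphein.ml.html#graphein.ml.datasets.torch_geometric_dataset.ProteinGraphListDataset) - For creating a dataset from a list of pre-computed PyTorch Geometric graphs.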

Both `ProteinGraphDataset` and `InMemoryProteinGraphDataset` take care of downloading structures from the [RCSB PDB](https://www.rcsb.org/), the [EBI AlphaFold database](https://alphafold.com/), or both!
`ProteinGraphListDataset` is a lightweight alternative for creating a dataset from a collection of graphs you have pre-computed.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/a-r-j/graphein/blob/master/notebooks/dataloader_tutorial.ipynb)

In [9]:
# Install graphein if necessary
# !pip install graphein

# Install torch if necessary. See https://pytorch.org/get-started/locally/
# !pip install torch==1.11.0

# Install torch geometric if necessary. See: https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html
# !pip install torch-scatter torch-sparse torch-cluster torch-spline-conv torch-geometric -f https://data.pyg.org/whl/torch-1.11.0+cpu.html

### ProteinGraphDataset

[API Reference](https://graphein.ai/modules/graphein.ml.html#graphein.ml.datasets.torch_geometric_dataset.ProteinGraphDataset)

`ProteinGraphDataset` downloads structures from the PDB and/or AlphaFoldDB and processes them into graphs according to a `ProteinGraphConfig`.

#### Parameters
```python
ProteinGraphDataset(
        root: str,                                                             
        # Root directory where the dataset should be saved.
        name: str,                                                             
        # Name of the dataset. Will be saved to ``data_$name.pt``.
        pdb_paths: Optional[List[str]] = None,
        # List of full paths to the PDB files to load.
        pdb_codes: Optional[List[str]] = None,
        # List of PDB codes to download and parse from the PDB.
        uniprot_ids: Optional[List[str]] = None,
        # List of UniProt IDs to download and parse from the AlphaFold Database.
        graph_label_map: Optional[Dict[str, torch.Tensor]] = None,             
        # Dictionary mapping PDB/Uniprot IDs to graph-level labels.
        node_label_map: Optional[Dict[str, torch.Tensor]] = None,              
        # Dictionary mapping PDB/Uniprot IDs to node-level labels.
        chain_selection_map: Optional[Dict[str, List[str]]] = None,            
        # Dictionary mapping PDB/Uniprot IDs to the desired chains in the PDB files
        graphein_config: ProteinGraphConfig = ProteinGraphConfig(),            
        # Protein graph construction config
        graph_format_convertor: GraphFormatConvertor = GraphFormatConvertor(   
            src_format="nx", dst_format="pyg"
        ),
        # Conversion handler for graphs
        graph_transformation_funcs: Optional[List[Callable]] = None,           
        # List of functions that consume a nx.Graph and return a nx.Graph. Applied to graphs after construction but before conversion to pyg
        transform: Optional[Callable] = None,                                  
        # A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access.
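        pdb_transform: Optional[List[Callable]] = None,
        # List of functions that each receive the list of paths to the downloaded structures. Applied to the raw PDB files before graphs are computed.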
        pre_transform: Optional[Callable] = None,                              
        # A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk
        pre_filter: Optional[Callable] = None,                                 
        # A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset
        num_cores: int = 16,                                                   
        # Number of cores to use for multiprocessing of graph construction
        af_version: int = 2,                                                   
        # Version of AlphaFoldDB structures to use.
    )
```


#### Directory Structure
Creating a ``ProteinGraphDataset`` will create two directories under ``root``:

* ``root/raw`` - Contains raw PDB files which are downloaded
* ``root/processed`` - Contains processed graphs (in ``torch_geometric.data.Data`` format) saved as ``$PDB.pt / $UNIPROT_ID.pt``
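
As a rough sketch of the layout described above, the helper below predicts where each processed graph would be written. `expected_processed_files` is a hypothetical name and not part of Graphein; the one-file-per-ID naming is taken from the description above rather than from the library source.

```python
from pathlib import Path
from typing import Iterable, List


def expected_processed_files(root: str, ids: Iterable[str]) -> List[Path]:
    """Return the paths where processed graphs are expected to be saved.

    Assumes one ``<ID>.pt`` file per structure under ``root/processed``.
    """
    processed_dir = Path(root) / "processed"
    return [processed_dir / f"{identifier}.pt" for identifier in ids]


paths = expected_processed_files("data", ["3eiy", "Q5VSL9"])
print([p.as_posix() for p in paths])
```

Comparing this list against the files actually on disk is one way to spot structures that failed to process.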

In [10]:
import torch
from graphein.ml import ProteinGraphDataset
import graphein.protein as gp

# Create some labels
g_labels = torch.randn([5])
n_labels = torch.randn([5, 10])

g_lab_map = {"3eiy": g_labels[0], "4hhb": g_labels[1], "Q5VSL9": g_labels[2], "1lds": g_labels[3], "Q8W3K0": g_labels[4]}
node_lab_map = {"3eiy": n_labels[0], "4hhb": n_labels[1], "Q5VSL9": n_labels[2], "1lds": n_labels[3], "Q8W3K0": n_labels[4]}

# Select some chains
chain_selection_map = {"4hhb": "A"}


# Create the dataset
ds = ProteinGraphDataset(
    root = "../graphein/ml/datasets/test",
    pdb_codes=["3eiy", "4hhb", "1lds"],
    uniprot_ids=["Q5VSL9", "Q8W3K0"],
    graph_label_map=g_lab_map,
    node_label_map=node_lab_map,
    chain_selection_map=chain_selection_map,
    graphein_config=gp.ProteinGraphConfig()
)
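
Because the label maps are keyed by PDB code or UniProt ID, it can be worth checking that every requested structure has a label before building the dataset. The following is a minimal sketch using exact string matching; `missing_labels` is a hypothetical helper, not a Graphein function, and plain floats stand in for the `torch.Tensor` labels used above.

```python
from typing import Dict, Iterable, List


def missing_labels(ids: Iterable[str], label_map: Dict[str, object]) -> List[str]:
    """Return the IDs that have no entry in ``label_map``."""
    return [identifier for identifier in ids if identifier not in label_map]


pdb_codes = ["3eiy", "4hhb", "1lds"]
uniprot_ids = ["Q5VSL9", "Q8W3K0"]
# Placeholder labels; in practice these would be torch.Tensor values.
g_lab_map = {"3eiy": 0.1, "4hhb": 0.2, "Q5VSL9": 0.3, "1lds": 0.4}

print(missing_labels(pdb_codes + uniprot_ids, g_lab_map))  # ['Q8W3K0']
```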

In [11]:
# Create a dataloader from dataset and inspect a batch
from torch_geometric.loader import DataLoader

dl = DataLoader(ds, batch_size=2, shuffle=True, drop_last=True)
for i in dl:
    print(i)
    print("Graph labels: ", i.graph_y)
    print("Node labels: ", i.node_y)
    break

DataBatch(edge_index=[2, 236], node_id=[2], coords=[2], name=[2], dist_mat=[2], num_nodes=238, graph_y=[2], node_y=[20], batch=[238], ptr=[3])
Graph labels:  tensor([ 0.5660, -0.7161])
Node labels:  tensor([-1.2430,  0.8221, -0.0296, -0.3522,  1.7685, -2.3006, -0.1209, -1.4377,
        -1.2816, -0.7039, -0.8580, -0.5647, -1.6848, -1.5069, -2.8355, -0.4000,
         0.3203,  0.1497, -1.0708,  0.3418])
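
The printed `DataBatch` merges the two graphs into one large disconnected graph: `batch=[238]` assigns each of the 238 nodes to its source graph, `ptr=[3]` stores the cumulative node offsets, and `node_y=[20]` is consistent with two label vectors of length 10 being concatenated. The plain-Python sketch below mimics this bookkeeping for illustration only; `batch_bookkeeping` is a hypothetical helper and the node counts are made up.

```python
from itertools import accumulate
from typing import List, Tuple


def batch_bookkeeping(nodes_per_graph: List[int]) -> Tuple[List[int], List[int]]:
    """Mimic the ``batch`` and ``ptr`` vectors of a PyG mini-batch."""
    # ptr holds cumulative node offsets: graph k owns nodes ptr[k] .. ptr[k + 1] - 1.
    ptr = [0] + list(accumulate(nodes_per_graph))
    # batch maps every node to the index of the graph it came from.
    batch = [g for g, n in enumerate(nodes_per_graph) for _ in range(n)]
    return batch, ptr


batch, ptr = batch_bookkeeping([3, 2])
print(batch)  # [0, 0, 0, 1, 1]
print(ptr)    # [0, 3, 5]
```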


#### Load from local path


Creating a ``ProteinGraphDataset`` from a list of full paths to local PDB files produces the same two directories:

* ``root/raw`` - Will be empty since no PDB files are downloaded
* ``root/processed`` - Contains processed graphs (in ``torch_geometric.data.Data`` format) saved as ``$PDB.pt / $UNIPROT_ID.pt``

In [12]:
# import sys
# sys.path.append('../')  # add system path for python

import os 
from graphein.protein.config import ProteinGraphConfig
from graphein.ml import ProteinGraphDataset, ProteinGraphListDataset
import torch 

local_dir = "../tests/protein/test_data/"
pdb_paths = [os.path.join(local_dir, pdb_path) for pdb_path in os.listdir(local_dir) if pdb_path.endswith(".pdb")]
print(pdb_paths)

# let's load local dataset from local_dir!
ds = ProteinGraphDataset(
    root = "../graphein/ml/datasets/test",
    pdb_paths = pdb_paths,
    graphein_config=ProteinGraphConfig(),
)

['../tests/protein/test_data/1lds.pdb', '../tests/protein/test_data/4hhb.pdb', '../tests/protein/test_data/alphafold_structure.pdb']
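
The listing above relies on `os.listdir`, which returns entries in arbitrary order. If a reproducible dataset order matters, a sorted variant may be preferable. This is a minimal sketch; `collect_pdb_paths` is a hypothetical helper, and it only matches the lowercase `.pdb` extension in the top level of the directory.

```python
import tempfile
from pathlib import Path
from typing import List


def collect_pdb_paths(directory: str) -> List[str]:
    """Return the sorted paths of all ``.pdb`` files directly inside ``directory``."""
    return sorted(str(p) for p in Path(directory).glob("*.pdb"))


# Demonstrate on a throwaway directory containing two structures and one other file.
with tempfile.TemporaryDirectory() as tmp:
    for fname in ("4hhb.pdb", "1lds.pdb", "notes.txt"):
        (Path(tmp) / fname).touch()
    found = [Path(p).name for p in collect_pdb_paths(tmp)]

print(found)  # ['1lds.pdb', '4hhb.pdb']
```

The resulting list can be passed as `pdb_paths` in the same way as the list built with `os.listdir`.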


In [17]:
# Create a dataloader from dataset and inspect a batch
from torch_geometric.loader import DataLoader
dl = DataLoader(ds, batch_size=2, shuffle=True, drop_last=True)
for i in dl:
    print(i)
    break

DataBatch(edge_index=[2, 666], node_id=[2], coords=[2], name=[2], dist_mat=[2], num_nodes=671, batch=[671], ptr=[3])


### InMemoryProteinGraphDataset

[API Reference](https://graphein.ai/modules/graphein.ml.html#graphein.ml.datasets.torch_geometric_dataset.InMemoryProteinGraphDataset)

#### Parameters
```python
InMemoryProteinGraphDataset(
        root: str,                                                             
        # Root directory where the dataset should be saved.
        name: str,                                                             
        # Name of the dataset. Will be saved to ``data_$name.pt``.
        pdb_paths: Optional[List[str]] = None,
        # List of full paths to the PDB files to load.
        pdb_codes: Optional[List[str]] = None,
        # List of PDB codes to download and parse from the PDB.
        uniprot_ids: Optional[List[str]] = None,
        # List of UniProt IDs to download and parse from the AlphaFold Database.
        graph_label_map: Optional[Dict[str, torch.Tensor]] = None,             
        # Dictionary mapping PDB/Uniprot IDs to graph-level labels.
        node_label_map: Optional[Dict[str, torch.Tensor]] = None,              
        # Dictionary mapping PDB/Uniprot IDs to node-level labels.
        chain_selection_map: Optional[Dict[str, List[str]]] = None,            
        # Dictionary mapping PDB/Uniprot IDs to the desired chains in the PDB files
        graphein_config: ProteinGraphConfig = ProteinGraphConfig(),            
        # Protein graph construction config
        graph_format_convertor: GraphFormatConvertor = GraphFormatConvertor(   
            src_format="nx", dst_format="pyg"
        ),
        # Conversion handler for graphs
        graph_transformation_funcs: Optional[List[Callable]] = None,           
        # List of functions that consume a nx.Graph and return a nx.Graph. Applied to graphs after construction but before conversion to pyg
        transform: Optional[Callable] = None,                                  
        # A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access.
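        pdb_transform: Optional[List[Callable]] = None,
        # List of functions that each receive the list of paths to the downloaded structures. Applied to the raw PDB files before graphs are computed.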
        pre_transform: Optional[Callable] = None,                              
        # A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk
        pre_filter: Optional[Callable] = None,                                 
        # A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset
        num_cores: int = 16,                                                   
        # Number of cores to use for multiprocessing of graph construction
        af_version: int = 2,                                                   
        # Version of AlphaFoldDB structures to use.
    )
```

#### Directory Structure
Creating an ``InMemoryProteinGraphDataset`` will create two directories under ``root``:
* ``root/raw`` - Contains raw PDB files
* ``root/processed`` - Contains processed datasets saved as ``data_{name}.pt``
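
PyTorch Geometric's dataset base classes skip processing when the expected files already exist under ``root/processed``. Assuming Graphein's datasets inherit that behaviour unchanged, reusing the same ``root`` and ``name`` with a different label map or config may load the stale ``data_{name}.pt``. The sketch below clears the processed directory to force a rebuild; `clear_processed` is a hypothetical helper, demonstrated on a temporary directory rather than a real dataset root.

```python
import shutil
import tempfile
from pathlib import Path


def clear_processed(root: str) -> bool:
    """Delete ``root/processed`` if it exists; return True if it was removed."""
    processed_dir = Path(root) / "processed"
    if processed_dir.is_dir():
        shutil.rmtree(processed_dir)
        return True
    return False


# Demonstrate on a throwaway directory rather than a real dataset root.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "processed").mkdir()
    (Path(tmp) / "processed" / "data_test.pt").touch()
    removed_first = clear_processed(tmp)
    removed_second = clear_processed(tmp)

print(removed_first, removed_second)  # True False
```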

In [9]:
from graphein.ml import InMemoryProteinGraphDataset

g_lab_map = {"3eiy": 1, "4hhb": 2, "Q5VSL9": 3, "1lds": 10, "2ll6": 4}
node_lab_map = {"3eiy": 1, "4hhb": 2, "Q5VSL9": 3, "1lds": 10, "2ll6": 4}
chain_selection_map = {"4hhb": "A"}

ds = InMemoryProteinGraphDataset(
    root = "../graphein/ml/datasets/test",
    name="test",
    pdb_codes=["3eiy", "4hhb", "1lds", "2ll6"],
    uniprot_ids=["Q5VSL9"],
    graph_label_map=g_lab_map,
    node_label_map=node_lab_map,
    chain_selection_map=chain_selection_map
)

Processing...
To do so, use the following command: conda install -c pytorch3d pytorch3d
INFO:graphein.protein.graphs:Constructing graph for: ../graphein/ml/datasets/test/raw/4hhb.pdb. Chain selection: A
INFO:graphein.protein.graphs:Constructing graph for: ../graphein/ml/datasets/test/raw/Q5VSL9.pdb. Chain selection: all
INFO:graphein.protein.graphs:Constructing graph for: ../graphein/ml/datasets/test/raw/1lds.pdb. Chain selection: all
INFO:graphein.protein.graphs:Constructing graph for: ../graphein/ml/datasets/test/raw/2ll6.pdb. Chain selection: all
INFO:graphein.protein.graphs:Constructing graph for: ../graphein/ml/datasets/test/raw/3eiy.pdb. Chain selection: all
DEBUG:graphein.protein.graphs:D

In [5]:
# Create a dataloader from dataset and inspect a batch
dl = DataLoader(ds, batch_size=2, shuffle=True, drop_last=True)
for i in dl:
    print(i)
    break

DataBatch(edge_index=[2, 236], node_id=[2], coords=[2], name=[2], dist_mat=[2], graph_y=[2], node_y=[2], num_nodes=238, batch=[238], ptr=[3])


#### Load from local path


Creating an ``InMemoryProteinGraphDataset`` from a list of full paths to local PDB files produces the same two directories:

* ``root/raw`` - Will be empty since no PDB files are downloaded
* ``root/processed`` - Contains processed datasets saved as ``data_{name}.pt``


In [19]:
from graphein.ml.datasets.torch_geometric_dataset import InMemoryProteinGraphDataset


local_dir = "../tests/protein/test_data/"
pdb_paths = [os.path.join(local_dir, pdb_path) for pdb_path in os.listdir(local_dir) if pdb_path.endswith(".pdb")]
print(pdb_paths)

# let's load local dataset from local_dir!
ds = InMemoryProteinGraphDataset(
    root = "../graphein/ml/datasets/test",
    name = "test",
    pdb_paths = pdb_paths,
)


['../tests/protein/test_data/1lds.pdb', '../tests/protein/test_data/4hhb.pdb', '../tests/protein/test_data/alphafold_structure.pdb']
Constructing Graphs...


Processing...


  0%|          | 0/3 [00:00<?, ?it/s]

Converting Graphs...
Saving Data...
Done!


Done!


In [20]:
# Create a dataloader from dataset and inspect a batch
dl = DataLoader(ds, batch_size=2, shuffle=True, drop_last=True)
for i in dl:
    print(i)
    break

DataBatch(edge_index=[2, 951], node_id=[2], coords=[2], name=[2], dist_mat=[2], num_nodes=956, batch=[956], ptr=[3])


### ProteinGraphListDataset

[API Reference](https://graphein.ai/modules/graphein.ml.html#graphein.ml.datasets.torch_geometric_dataset.ProteinGraphListDataset)

The `ProteinGraphListDataset` class is a lightweight wrapper around a list of pre-computed `torch_geometric.data.Data` graphs.

#### Parameters

```python
ProteinGraphListDataset(
    root: str,                              # Root directory where the dataset is stored.
    data_list: List[Data],                  # List of protein graphs as PyTorch Geometric Data objects.
    name: str,                              # Name of dataset. Data will be saved as ``data_{name}.pt``.
    transform: Optional[Callable]=None      # A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access.
    )
```

In [41]:
from graphein.ml import ProteinGraphListDataset, GraphFormatConvertor
import graphein.protein as gp

# Construct graphs
graphs = gp.construct_graphs_mp(
    pdb_code_it=["3eiy", "4hhb", "1lds", "2ll6"],
    return_dict=False
    )

# do some transformation
graphs = [gp.extract_subgraph_from_chains(g, ["A"]) for g in graphs]

# Convert to PyG Data format
convertor = GraphFormatConvertor(src_format="nx", dst_format="pyg")
graphs = [convertor(g) for g in graphs]

# Create dataset
ds = ProteinGraphListDataset(root=".", data_list=graphs, name="list_test")

To do so, use the following command: conda install -c pytorch3d pytorch3d
INFO:graphein.protein.graphs:Constructing graph for: 4hhb. Chain selection: all
INFO:graphein.protein.graphs:Constructing graph for: 3eiy. Chain selection: all
INFO:graphein.protein.graphs:Constructing graph for: 1lds. Chain selection: all
INFO:graphein.protein.graphs:Constructing graph for: 2ll6. Chain selection: all
DEBUG:graphein.protein.graphs:Deprotonating protein. This removes H atoms from the pdb_df dataframe
DEBUG:graphein.protein.graphs:Deprotonating protein. This removes H atoms from the pdb_df dataframe
DEBUG:graphein.protein.graphs:Detected 174 total nodes
DEBUG:graphein.protein.graphs:Detected 97 total nodes
DEBUG:graphein.protein.features.nodes.amino_acid:Reading meiler embeddings 

174
97


DEBUG:graphein.protein.graphs:Deprotonating protein. This removes H atoms from the pdb_df dataframe
DEBUG:graphein.protein.graphs:Detected 574 total nodes
DEBUG:graphein.protein.features.nodes.amino_acid:Reading meiler embeddings from: /Users/arianjamasb/github/graphein/graphein/protein/features/nodes/meiler_embeddings.csv


574


DEBUG:graphein.protein.graphs:Deprotonating protein. This removes H atoms from the pdb_df dataframe
DEBUG:graphein.protein.graphs:Detected 165 total nodes
DEBUG:graphein.protein.features.nodes.amino_acid:Reading meiler embeddings from: /Users/arianjamasb/github/graphein/graphein/protein/features/nodes/meiler_embeddings.csv


165


DEBUG:graphein.protein.subgraphs:Found 174 nodes in the chain subgraph.
DEBUG:graphein.protein.subgraphs:Creating subgraph from nodes: ['A:VAL:107', 'A:SER:26', 'A:PHE:45', 'A:PRO:53', 'A:VAL:29', 'A:ALA:144', 'A:LEU:39', 'A:GLN:14', 'A:TYR:56', 'A:PRO:69', 'A:MET:117', 'A:ASP:98', 'A:SER:123', 'A:VAL:72', 'A:PRO:60', 'A:ILE:135', 'A:THR:48', 'A:ALA:33', 'A:PHE:78', 'A:ALA:108', 'A:ASN:55', 'A:ARG:87', 'A:GLU:21', 'A:THR:167', 'A:GLU:27', 'A:MET:96', 'A:ALA:119', 'A:VAL:74', 'A:TYR:31', 'A:LEU:63', 'A:VAL:114', 'A:ALA:36', 'A:ASP:168', 'A:GLY:83', 'A:PHE:173', 'A:GLN:25', 'A:TRP:156', 'A:PHE:16', 'A:ASP:71', 'A:LEU:94', 'A:ASP:112', 'A:LEU:12', 'A:GLY:65', 'A:LEU:81', 'A:PRO:23', 'A:VAL:102', 'A:ALA:8', 'A:LYS:136', 'A:GLU:140', 'A:ILE:22', 'A:LYS:35', 'A:ALA:82', 'A:GLU:165', 'A:PHE:139', 'A:LEU:80', 'A:LYS:152', 'A:GLY:38', 'A:GLN:61', 'A:SER:2', 'A:GLY:169', 'A:LEU:106', 'A:ASP:157', 'A:ASN:172', 'A:LYS:10', 'A:VAL:18', 'A:ARG:44', 'A:GLU:99', 'A:ASP:125', 'A:ILE:19', 'A:ILE:159', '

In [54]:
for i in ds:
    print(i)

Data(edge_index=[2, 173], node_id=[174], coords=[1], name=[1], dist_mat=[1], num_nodes=174)
Data(edge_index=[2, 140], node_id=[141], coords=[1], name=[1], dist_mat=[1], num_nodes=141)
Data(edge_index=[2, 96], node_id=[97], coords=[1], name=[1], dist_mat=[1], num_nodes=97)
Data(edge_index=[2, 147], node_id=[148], coords=[1], name=[1], dist_mat=[1], num_nodes=148)


In [6]:
# Create a dataloader from dataset and inspect a few batches
dl = DataLoader(ds, batch_size=2, shuffle=True, drop_last=False)
for i in dl:
    print(i)

DataBatch(edge_index=[2, 303], node_id=[2], coords=[2], name=[2], dist_mat=[2], graph_y=[2], node_y=[2], num_nodes=306, batch=[306], ptr=[3])
DataBatch(edge_index=[2, 1009], node_id=[2], coords=[2], name=[2], dist_mat=[2], graph_y=[2], node_y=[2], num_nodes=1011, batch=[1011], ptr=[3])
DataBatch(edge_index=[2, 96], node_id=[1], coords=[1], name=[1], dist_mat=[1], graph_y=[1], node_y=[1], num_nodes=97, batch=[97], ptr=[2])


### Transforms

We can supply various functions to `ProteinGraphDataset` and `InMemoryProteinGraphDataset` to alter the composition of the dataset.

* ``pdb_transform`` (``List[Callable]``, optional) - A list of functions, each of which receives the list of paths to the downloaded structures. This provides an entry point for applying pre-processing with bioinformatics tools of your choosing. (default: ``None``)

* ``graph_transformation_funcs`` (``List[Callable]``, optional) - A list of functions that consume a ``nx.Graph`` and return a ``nx.Graph``. Applied to graphs after construction but before conversion to ``torch_geometric.data.Data``. (default: ``None``)

* ``transform`` (``Callable``, optional) - A function/transform that takes in a ``torch_geometric.data.Data`` object and returns a transformed version. The data object will be transformed before every access. (default: ``None``)

* ``pre_transform`` (``Callable``, optional) - A function/transform that takes in a ``torch_geometric.data.Data`` object and returns a transformed version. The data object will be transformed before being saved to disk. (default: ``None``)

* ``pre_filter`` (``Callable``, optional) - A function that takes in a ``torch_geometric.data.Data`` object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default: ``None``)

In [3]:
from typing import List
import networkx as nx
from torch_geometric.data import Data

# Create dummy transforms
def pdb_transform_fn(files: List[str]):
    """Transforms raw pdbs prior to computing graphs."""
    return

def graph_transform_fn(graph: nx.Graph) -> nx.Graph:
    """Transforms graphein nx.Graph prior to conversion to torch_geometric.data.Data."""
    return graph

def transform_fn(data: Data) -> Data:
    """Transforms torch_geometric.data.Data prior to every access."""
    return data

def pre_transform_fn(data: Data) -> Data:
    """Transforms torch_geometric.data.Data prior to saving to disk."""
    return data

def pre_filter_fn(data: Data) -> bool:
    """Takes in a torch_geometric.data.Data and returns True if the data should be included in the dataset."""
    return True
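
The dummy functions above leave the data unchanged. As a slightly more concrete sketch, a `pre_filter` might drop very large structures based on the `num_nodes` attribute that `torch_geometric.data.Data` exposes. `make_max_nodes_filter` is a hypothetical helper, and `SimpleNamespace` objects stand in for `Data` so the example runs without torch installed.

```python
from types import SimpleNamespace
from typing import Any, Callable


def make_max_nodes_filter(max_nodes: int) -> Callable[[Any], bool]:
    """Build a ``pre_filter`` that keeps graphs with at most ``max_nodes`` nodes."""

    def pre_filter(data: Any) -> bool:
        # Only the ``num_nodes`` attribute of the data object is inspected.
        return data.num_nodes <= max_nodes

    return pre_filter


keep_small = make_max_nodes_filter(200)
# Stand-ins for Data objects with different node counts.
print(keep_small(SimpleNamespace(num_nodes=174)))  # True
print(keep_small(SimpleNamespace(num_nodes=574)))  # False
```

The returned callable could then be passed as `pre_filter=keep_small` when constructing a dataset.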

In [2]:
from graphein.ml.datasets.torch_geometric_dataset import InMemoryProteinGraphDataset

g_lab_map = {"3eiy": 1, "4hhb": 2, "Q5VSL9": 3, "1lds": 10, "2ll6": 4}
node_lab_map = {"3eiy": 1, "4hhb": 2, "Q5VSL9": 3, "1lds": 10, "2ll6": 4}
chain_selection_map = {"4hhb": "A"}

ds = InMemoryProteinGraphDataset(
    root = "../graphein/ml/datasets/test",
    name="test",
    pdb_codes=["3eiy", "4hhb", "1lds", "2ll6"],
    uniprot_ids=["Q5VSL9"],
    graph_label_map=g_lab_map,
    node_label_map=node_lab_map,
    chain_selection_map=chain_selection_map,
    pdb_transform=[pdb_transform_fn],
    graph_transformation_funcs=[graph_transform_fn],
    transform=transform_fn,
    pre_transform=pre_transform_fn,
    pre_filter=pre_filter_fn
)

To do so, use the following command: conda install -c pytorch3d pytorch3d
  0%|          | 0/4 [00:00<?, ?it/s]

Downloading PDB structure '3eiy'...


INFO:graphein.protein.utils:Downloaded PDB file for: 3eiy
 25%|██▌       | 1/4 [00:01<00:04,  1.52s/it]

Downloading PDB structure '4hhb'...


INFO:graphein.protein.utils:Downloaded PDB file for: 4hhb
 50%|█████     | 2/4 [00:03<00:03,  1.61s/it]

Downloading PDB structure '1lds'...


INFO:graphein.protein.utils:Downloaded PDB file for: 1lds
 75%|███████▌  | 3/4 [00:04<00:01,  1.51s/it]

Downloading PDB structure '2ll6'...


INFO:graphein.protein.utils:Downloaded PDB file for: 2ll6
100%|██████████| 4/4 [00:06<00:00,  1.66s/it]
  0%|          | 0/1 [00:00<?, ?it/s]INFO:graphein.protein.utils:Downloaded AlphaFold PDB file for: Q5VSL9
100%|██████████| 1/1 [00:00<00:00,  8.83it/s]
Processing...
To do so, use the following command: conda install -c pytorch3d pytorch3d
INFO:graphein.protein.graphs:Constructing graph for: ../graphein/ml/datasets/test/raw/Q5VSL9.pdb. Chain selection: all
INFO:graphein.protein.graphs:Constructing graph for: ../graphein/ml/datasets/test/raw/1lds.pdb. Chain selection: all
INFO:graphein.protein.graphs:Constructing graph for: ../graphein/ml/datasets/test/raw/3eiy.pdb. Chain selection: all
INFO:g