# FFF Workshop

## A4: Selections and interactions

### Outline

- Selecting sets of compounds and poses
- Working with DataFrames


- Subsite selection
- Subsite assignment
	- From canonsite
	- Manual
	- Posebutcher?
- Interaction profiling
- Interaction visualisation

## Selecting sets of compounds and poses

Below are some examples of how to create selections of compounds and poses:

In [None]:
# set up the animal
%load_ext autoreload
%autoreload 2
# from mrich import print
import hippo
animal = hippo.HIPPO(
    "A71EV2A_demo",
    "../data/A71EV2A.sqlite",
)

In [None]:
# by tag
soaks = animal.compounds(tag="soaks")
print(soaks)

# by ids
cset = animal.compounds[1,3,4,6,12,43]
print("compounds with IDs 1,3,4,6,12,43:", cset)

# from set of poses
hit_compounds = animal.poses(tag="hits").compounds
print("hit compounds:", hit_compounds)

# by metadata info
double_soaks = hit_compounds.get_by_metadata(key="SoakDB count", value=2)
print("double soaks:", double_soaks)

Set arithmetic and boolean operations between selections are also supported:

In [None]:
# add two sets:
triple_soaks = hit_compounds.get_by_metadata(key="SoakDB count", value=3)
selection = double_soaks + triple_soaks
print("double and triple soaks:", selection)

# subtract:
print("hit_compounds - double_soaks:", hit_compounds - double_soaks)

# single addition:
print("double_soaks + C3:", double_soaks + animal.C3)

# set intersection
print("hit_compounds & double_soaks:", hit_compounds & double_soaks)

# set union
print("hit_compounds | double_soaks:", hit_compounds | double_soaks)

# set exclusive OR
print("hit_compounds ^ double_soaks:", hit_compounds ^ double_soaks)
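
To build intuition for these operators without a database at hand, here is a minimal sketch using plain Python `set` objects of integer IDs. The IDs are made up for illustration, and it is an assumption that `+` on HIPPO sets behaves like a union; check the API reference for the exact semantics (for example, how ordering and duplicates are handled).

```python
# Plain-Python sketch: integer IDs stand in for HIPPO Compound objects.
# All IDs below are hypothetical and not taken from the demo database.
hit_ids = {1, 3, 4, 6, 12, 43}   # stand-in for hit_compounds
double_ids = {3, 6}              # stand-in for double_soaks
triple_ids = {12}                # stand-in for triple_soaks

# Assumption: `double_soaks + triple_soaks` corresponds to a union.
combined = double_ids | triple_ids

difference = hit_ids - double_ids     # in hits but not in double soaks
intersection = hit_ids & double_ids   # in both sets
union = hit_ids | double_ids          # in either set
symmetric = hit_ids ^ double_ids      # in exactly one of the two sets

print("combined:", sorted(combined))
print("difference:", sorted(difference))
print("intersection:", sorted(intersection))
print("union:", sorted(union))
print("symmetric difference:", sorted(symmetric))
```

Because `double_ids` is a subset of `hit_ids` in this toy data, the symmetric difference equals the plain difference.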

Once you are happy with a selection, you can tag it for easier retrieval later:

In [None]:
selection.add_tag("double and triple soaks")

See also the API reference documentation:

- [CompoundTable](https://hippo-docs.winokan.com/en/latest/compounds.html#hippo.cset.CompoundTable) (i.e. `animal.compounds`)
- [CompoundSet](https://hippo-docs.winokan.com/en/latest/compounds.html#hippo.cset.CompoundSet) (i.e. a set of compounds)
- [PoseTable](https://hippo-docs.winokan.com/en/latest/poses.html#hippo.pset.PoseTable) (i.e. `animal.poses`)
- [PoseSet](https://hippo-docs.winokan.com/en/latest/poses.html#hippo.pset.PoseSet) (i.e. a set of poses)

## Working with DataFrames

`pandas.DataFrame` objects are a popular way to work with tabular data. If you are familiar with pandas, you can get DataFrame representations of HIPPO objects as well:

In [None]:
df = selection.get_df()
df.head()

In [None]:
# more options are available to add more columns:
df = selection.get_df(
    smiles=True,
    inchikey=True,
    alias=True,
    metadata=True,
    expand_metadata=True,
    num_poses=True,
)
df.head()
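
As a rough illustration of what expanding metadata into columns can look like, the sketch below does it by hand in pandas on a toy table. This is an assumption about the general idea only, not HIPPO's implementation: the table contents, the `metadata` column holding one dict per row, and the `note` key are all hypothetical.

```python
import pandas as pd

# Toy table: one row per compound, with metadata stored as a dict per row.
# Values are hypothetical and only mimic the shape of a compound table.
df = pd.DataFrame(
    {
        "smiles": ["CCO", "c1ccccc1"],
        "metadata": [
            {"SoakDB count": 2},
            {"SoakDB count": 3, "note": "example"},
        ],
    },
    index=pd.Index([1, 3], name="id"),
)

# Turn each metadata key into its own column; keys missing from a row become NaN.
expanded = pd.json_normalize(df["metadata"].tolist())
expanded.index = df.index

# Replace the dict column with the expanded columns.
wide = df.drop(columns="metadata").join(expanded)
print(wide)
```

With the metadata keys as ordinary columns, they can be used directly in filters and sorts.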

See also the API reference:

- [CompoundSet.get_df()](https://hippo-docs.winokan.com/en/latest/compounds.html#hippo.cset.CompoundSet.get_df)
- [PoseSet.get_df()](https://hippo-docs.winokan.com/en/latest/poses.html#hippo.pset.PoseSet.get_df)

Or try the `help()` function:

In [None]:
help(selection.get_df)

Once you have a DataFrame, you can filter and sort it, and also add columns from other sources:

In [None]:
# select compounds with multiple poses
filtered_df = df[df["num_poses"] > 1]

# sort compounds by SoakDB count
sorted_df = filtered_df.sort_values(by="SoakDB count", ascending=False)

sorted_df.head()
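
The cell above covers filtering and sorting. Adding a column from another source could be sketched as follows, using a toy DataFrame in place of the `get_df()` output. The values, the `assay` series, and its `IC50_uM` name are hypothetical; the only assumption carried over from the cells above is that the DataFrame is indexed by compound ID.

```python
import pandas as pd

# Toy stand-in for the output of get_df(), indexed by compound ID.
df = pd.DataFrame(
    {"num_poses": [1, 2, 3, 2], "SoakDB count": [1, 2, 3, 1]},
    index=pd.Index([1, 3, 4, 6], name="id"),
)

# Hypothetical external measurements keyed by the same compound IDs.
assay = pd.Series({3: 12.5, 4: 80.0}, name="IC50_uM")

# A left join on the index keeps every compound; unmatched rows get NaN.
merged = df.join(assay)

# Same filter and sort as in the cell above.
result = merged[merged["num_poses"] > 1].sort_values(
    by="SoakDB count", ascending=False
)

print(result)
print(list(result.index))  # compound IDs in display order
```

The index of `result` still holds the compound IDs, in sorted order, so it can be used to look the compounds up again.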

You can then convert the DataFrame index back into a HIPPO `CompoundSet`:

In [None]:
# simplest way:
cset = animal.compounds[sorted_df.index]

# to maintain custom ordering:
cset = animal.compounds(ids=list(sorted_df.index), sort=False)

cset.interactive()

## Working with HIPPO subsites

HIPPO subsite objects provide an additional way to group `Pose` objects by arbitrary 'pockets' or 'subsites'.

For example, they can be assigned from the canonical sites defined by Fragalysis/XCA:

In [None]:
hits = animal.poses(tag="hits")
hits.set_subsites_from_metadata_field()