## Step-by-step instructions to download HEST-1k 

This tutorial shows how to:

- Download HEST-1k in its entirety (scanpy, whole-slide images, patches, nuclear segmentation, alignment preview)
- Download a subset of HEST-1k by sample ID
- Download samples that share some attributes (e.g., all breast cancer cases)
- Inspect freshly downloaded samples

### Download HEST-1k 

In [None]:
from huggingface_hub import snapshot_download

local_dir = '../hest_data'  # HEST-1k will be downloaded to this folder
snapshot_download(repo_id="MahmoodLab/hest", repo_type='dataset', local_dir=local_dir)
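
Once the download finishes, it can help to check what actually landed on disk. The sketch below is a minimal, standard-library-only helper that counts files per top-level folder of `local_dir`. The helper name `count_files_by_folder` is hypothetical (it is not part of `hest` or `huggingface_hub`), and skipping hidden entries is an assumption made to keep the summary focused on data files.

```python
from collections import Counter
from pathlib import Path


def count_files_by_folder(root):
    """Count files under each top-level folder of `root` (hypothetical helper)."""
    root = Path(root)
    counts = Counter()
    for path in root.rglob('*'):
        rel = path.relative_to(root)
        # Skip directories and anything inside a hidden file or folder
        if not path.is_file() or any(part.startswith('.') for part in rel.parts):
            continue
        # Files directly under `root` are grouped under '.'
        top = rel.parts[0] if len(rel.parts) > 1 else '.'
        counts[top] += 1
    return dict(counts)


local_dir = '../hest_data'  # same folder as in the download cell
if Path(local_dir).is_dir():
    for folder, n_files in sorted(count_files_by_folder(local_dir).items()):
        print(f'{folder}: {n_files} file(s)')
```

An empty or missing folder simply prints nothing, which is a quick hint that the download did not run or wrote to a different location.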


### Download HEST-1k based on sample IDs

In [None]:
from huggingface_hub import snapshot_download

local_dir = '../hest_data'  # HEST-1k will be downloaded to this folder
ids_to_query = ['TENX96', 'TENX99']  # list of sample IDs to query

# Match each ID only when it is followed by "_" or ".", so that 'TENX96' does not also match e.g. 'TENX960'
list_patterns = [f"*{sample_id}[_.]**" for sample_id in ids_to_query]
snapshot_download(repo_id="MahmoodLab/hest", repo_type='dataset', local_dir=local_dir, allow_patterns=list_patterns)
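
To see why the pattern ends in `[_.]`, the sketch below applies the same patterns to a few file paths using Python's `fnmatch` module. This assumes that `allow_patterns` follows the same glob-style matching rules. The file paths are made up for illustration and may not reflect the actual repository layout.

```python
from fnmatch import fnmatchcase

ids_to_query = ['TENX96', 'TENX99']
list_patterns = [f"*{sample_id}[_.]**" for sample_id in ids_to_query]

# Hypothetical file paths, for illustration only
candidate_paths = [
    'st/TENX96.h5ad',
    'patches/TENX99.h5',
    'cellvit_seg/TENX96_cellvit_seg.geojson.zip',
    'st/TENX960.h5ad',  # a different ID that merely starts with 'TENX96'
    'st/NCBI783.h5ad',
]


def is_selected(path, patterns):
    """Return True if the path matches at least one glob pattern."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


selected = [path for path in candidate_paths if is_selected(path, list_patterns)]
print(selected)
```

Dropping `[_.]` from the pattern would let `TENX96` match any ID that begins with the same characters, which is why the separator is part of the pattern.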


### Download HEST-1k based on queries (e.g., organ, technology, oncotree code)

In [None]:
import pandas as pd
from huggingface_hub import snapshot_download

local_dir = '../hest_data'  # HEST-1k will be downloaded to this folder

meta_df = pd.read_csv("../metadata/HEST_v1_0_0.csv")

# Filter the dataframe by organ, oncotree code...
meta_df = meta_df[meta_df['oncotree_code'] == 'IDC']
meta_df = meta_df[meta_df['organ'] == 'Breast']

ids_to_query = meta_df['id'].values

# Match each ID only when it is followed by "_" or "."
list_patterns = [f"*{sample_id}[_.]**" for sample_id in ids_to_query]
snapshot_download(repo_id="MahmoodLab/hest", repo_type='dataset', local_dir=local_dir, allow_patterns=list_patterns)

### Inspect freshly downloaded samples

In [None]:
from hest import load_hest

print('load hest...')
hest_d = load_hest('../hest_data') # location of the data
print('loaded hest')
for d in hest_d:
    print(d)