# 2.1 Getting genotype data
We are going to use Poseidon (https://poseidon-framework.github.io/#/) to easily retrieve genotype data together with some useful annotation. The tool for accessing the Poseidon package repository is named `trident`, and if you followed the recommendation for installing the conda environment above, you should have it installed already. To make sure, check with `which trident` and `trident --version`.

`trident` is a command line tool to manage Poseidon packages. Here we'll use it to automatically download packages that we need for this session. You can list all available packages like so:

In [None]:
!trident list --remote --packages

Here we specifically need the packages `2012_PattersonGenetics`, `2014_LazaridisNature` and `2019_Jeong_InnerEurasia`, which contain many present-day individuals from around the world, and `2014_RaghavanNature`, which contains a famous, roughly 24,000-year-old individual from Siberia. Let's fetch those packages and copy them into a local folder called `scratch/poseidon-repository` within this repository:

In [None]:
!mkdir -p scratch/poseidon-repository
# This will take a few seconds to pull the data from the server
!trident fetch -d scratch/poseidon-repository -f "*2012_PattersonGenetics*,*2014_LazaridisNature*,*2019_Jeong_InnerEurasia*,*2014_RaghavanNature*"

Great, now we have those packages. You can check out the files, e.g.:

In [None]:
!ls scratch/poseidon-repository/2014_LazaridisNature

And you can see three genotype files (`.bed`, `.bim` and `.fam`) and an annotation file ending with `.janno`.
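If you prefer to check this from Python rather than by eye, here is a minimal sketch. The helper names `files_by_suffix` and `missing_suffixes` are made up for this illustration and are not part of `trident`. It simply assumes that a package folder should hold at least one file for each of the four extensions mentioned above.

```python
from collections import defaultdict
from pathlib import Path


def files_by_suffix(package_dir):
    """Group the file names in a package directory by their extension."""
    grouped = defaultdict(list)
    for path in sorted(Path(package_dir).iterdir()):
        if path.is_file():
            grouped[path.suffix].append(path.name)
    return dict(grouped)


def missing_suffixes(package_dir, expected=(".bed", ".bim", ".fam", ".janno")):
    """Return the expected extensions (an assumption of this sketch) that have no file."""
    present = files_by_suffix(package_dir)
    return [suffix for suffix in expected if suffix not in present]
```

Calling `missing_suffixes("scratch/poseidon-repository/2014_LazaridisNature")` after the fetch should give an empty list if all four file types are there.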

You can also view lots of things about those packages using `trident`. For example:

In [None]:
!trident list --groups -d scratch/poseidon-repository/

or:

In [None]:
!trident summarise -d scratch/poseidon-repository
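The `.janno` annotation files are plain tab-separated tables, so you can also tally groups yourself. The following is only a sketch under two assumptions that you should check against your own files: that the group column is called `Group_Name`, and that an individual with several group names has them separated by `;` with the main one first. The function name `count_groups` is hypothetical.

```python
import csv
from collections import Counter


def count_groups(janno_lines, group_column="Group_Name"):
    """Count individuals per primary group in a tab-separated .janno table."""
    reader = csv.DictReader(janno_lines, delimiter="\t")
    counts = Counter()
    for row in reader:
        # Assumption: several group names are joined by ";" and the first
        # one is the primary group of the individual.
        primary = row[group_column].split(";")[0].strip()
        counts[primary] += 1
    return counts
```

You would pass it an open file handle for one of the `.janno` files (opened with `newline=""`), and `count_groups(handle).most_common(5)` then shows the five largest groups.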

OK, for further analysis we want to merge these four packages. In `trident` we can use the `forge` command for that. But we first need a population list that specifies what we would like to extract and merge. For this session, such a list is already provided, named `forge_file.txt`.

Let's look at the `forge_file.txt`:

In [None]:
!head forge_file.txt

OK, so there are many population names here. Let's count them:

In [None]:
!wc -l forge_file.txt
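Note that `wc -l` counts every line, including blank ones. If you want a stricter count, here is a minimal Python sketch. It assumes that everything after a `#` is a comment and that entries can be separated by commas as well as by newlines, which is how we understand the forge file format; please verify this against the `trident` documentation. The function name `forge_entries` is made up for this example.

```python
def forge_entries(text):
    """Split forge-file text into its individual entries."""
    entries = []
    for line in text.splitlines():
        # Assumption: everything after "#" on a line is a comment.
        line = line.split("#", 1)[0]
        # Assumption: entries may be separated by commas as well as newlines.
        for part in line.split(","):
            part = part.strip()
            if part:
                entries.append(part)
    return entries
```

With `text` read from `forge_file.txt`, `len(forge_entries(text))` should agree with the `wc -l` result as long as the file has one entry per line and no blank or comment lines.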

The file lists 120 populations. Let's use them to forge a new Poseidon package that contains only the genotype data and metadata for individuals that belong to one of these 120 groups. Forge takes a number of options (check them out using `trident forge --help`). Here we're just using a basic set of options (note that this will take a few minutes):

In [None]:
!trident forge -d scratch/poseidon-repository -o scratch/forged_package -n PCA_package_1 --forgeFile forge_file.txt --intersect --eigenstrat 
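Once forge has finished, a quick sanity check is to count how many individuals ended up in each population. In EIGENSTRAT format, the `.ind` file has one individual per line with three whitespace-separated columns: sample ID, sex, and population label. Here is a minimal sketch of such a tally. The function name `populations_in_ind` is hypothetical, and the output path mentioned below is an assumption based on the `-o` and `-n` options used above.

```python
from collections import Counter


def populations_in_ind(ind_lines):
    """Count individuals per population label in EIGENSTRAT .ind lines."""
    counts = Counter()
    for line in ind_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        _sample_id, _sex, population = fields[:3]
        counts[population] += 1
    return counts
```

Assuming the file is written to `scratch/forged_package/PCA_package_1.ind`, you can pass an open handle of that file to the function. `len(populations_in_ind(handle))` then tells you how many of the requested groups were actually found in the fetched packages.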

Congratulations. You now have your genotype data ready to be used to compute PCAs and friends.