# Getting started with tskit
These notes follow the step-by-step tutorial found [here](https://tskit.dev/tutorials/getting_started.html). We start by simulating a tree sequence with [msprime](https://tskit.dev/msprime/docs/stable/intro.html), a Python package that generates tree sequences for use with *tskit*.

> A number of different software programs can generate tree sequences. For the purposes of this tutorial we’ll use msprime to create an example tree sequence representing the genetic genealogy of a 10Mb chromosome in twenty diploid individuals. To make it a bit more interesting, we’ll simulate the effects of a selective sweep in the middle of the chromosome, then throw some neutral mutations onto the resulting tree sequence.

In [1]:
import msprime

pop_size=10_000
seq_length=10_000_000

sweep_model = msprime.SweepGenicSelection(
    position=seq_length/2, start_frequency=0.0001, end_frequency=0.9999, s=0.25, dt=1e-6)

ts = msprime.sim_ancestry(
    20,
    model=[sweep_model, msprime.StandardCoalescent()],
    population_size=pop_size,
    sequence_length=seq_length,
    recombination_rate=1e-8,
    random_seed=1234,  # only needed for repeatability
    )
# Optionally add finite-site mutations to the ts using the Jukes & Cantor model, creating SNPs
ts = msprime.sim_mutations(ts, rate=1e-8, random_seed=4321)
ts

| Tree Sequence | |
|---|---|
| Trees | 11167 |
| Sequence Length | 10000000.0 |
| Time Units | generations |
| Sample Nodes | 40 |
| Total Size | 2.4 MiB |
| Metadata | No Metadata |

| Table | Rows | Size | Has Metadata |
|---|---|---|---|
| Edges | 36372 | 1.1 MiB | |
| Individuals | 20 | 584 Bytes | |
| Migrations | 0 | 8 Bytes | |
| Mutations | 13568 | 490.3 KiB | |
| Nodes | 7342 | 200.8 KiB | |
| Populations | 1 | 224 Bytes | ✅ |
| Provenances | 2 | 1.9 KiB | |
| Sites | 13554 | 330.9 KiB | |


The `ts` object holds thousands of trees (11167 here). We simulated *20 diploid* individuals, and each diploid individual carries *two* genomes, each represented by its own sample node. That is why the summary reports 40 sample nodes.

Iterate over the *trees* with the `trees()` method:

In [2]:
for tree in ts.trees():
    print(f"Tree {tree.index} covers {tree.interval}")
    if tree.index >= 4:
        print("...")
        break
print(f"Tree {ts.last().index} covers {ts.last().interval}")

Tree 0 covers Interval(left=0.0, right=661.0)
Tree 1 covers Interval(left=661.0, right=3116.0)
Tree 2 covers Interval(left=3116.0, right=4451.0)
Tree 3 covers Interval(left=4451.0, right=4673.0)
Tree 4 covers Interval(left=4673.0, right=5020.0)
...
Tree 11166 covers Interval(left=9999635.0, right=10000000.0)


There are also `first()` and `last()` methods that return the *first* and *last* trees respectively. We can also check whether every tree has coalesced, which is not always the case for [forward simulations](https://tskit.dev/tutorials/forward_sims.html#sec-tskit-forward-simulations):

In [3]:
import time
elapsed = time.time()
for tree in ts.trees():
    if tree.has_multiple_roots:
        print(f"Tree {tree.index} has not coalesced")
        break
else:
    elapsed = time.time() - elapsed
    print(f"All {ts.num_trees} trees coalesced")
    print(f"Checked in {elapsed:.6g} secs")

All 11167 trees coalesced
Checked in 0.0209928 secs
