# Getting started with `gos`

This notebook demonstrates the core features of `gos`:

- Authoring declarative genomics visualizations which adhere to the [Gosling](http://gosling-lang.org/) JSON Specification

- Displaying Gosling visualizations directly in the notebook


Start by importing `gosling`.

In [None]:
# !pip install gosling==0.0.8
import gosling as gos

> **NOTE** By convention, `gosling` is imported as `gos` and the API is accessed through this namespace.

## Creating a `gos.Track`

**`gos`** exposes two fundamental building blocks for genomics visualizations provided by the Gosling grammar:

- `gos.Track`
- `gos.View`

A _Track_ is the core component of a genomics visualization that defines explicit **transformations** and **mappings** of genomics data to **visual properties**. A _Track_ may be composed with other _Tracks_ or **grouped** into a _View_ whose tracks share the same linked genomic domain.

The figure below depicts the hierarchical structure of a Gosling visualization, displaying **3** distinct _Views_ (light orange/blue/green), each of which is composed of several _Tracks_.

<img src="tracks-views.jpeg" width="600">

A _Track_ (dark orange/blue/green) is the base primitive for building genomics visualizations and requires binding a data source. In `gos` we define an abstract genomic data source and bind it to a _Track_ directly through the Python API.

We will start by loading a file containing UCSC hg38 cytoband information, which Gosling can read through its CSV data type.

In [None]:
data_url = "https://raw.githubusercontent.com/sehilyi/gemini-datasets/master/data/UCSC.HG38.Human.CytoBandIdeogram.bed"

# The dataset is a BED4+1 file which can be read in Gosling as the CSV datatype
data = gos.csv(
    url=data_url,
    headerNames=['chrom', 'chromStart', 'chromEnd', 'name', 'stain'], # the +1 field is stain
    chromosomeField="chrom", # the column containing chrom names
    genomicFields=["chromStart", "chromEnd"], # fields with genomic coordinates
    separator="\t",
)

# bind the data to a track
gos.Track(data)
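
For orientation, the data definition above corresponds to a `data` object in the Gosling JSON specification. The sketch below builds that object as a plain Python dictionary without using `gos`. The key names mirror the arguments passed to `gos.csv`, and the `"type": "csv"` entry is an assumption about how the data type is recorded, so treat this as an illustration rather than the exact output of the library.

```python
import json

data_url = (
    "https://raw.githubusercontent.com/sehilyi/gemini-datasets/"
    "master/data/UCSC.HG38.Human.CytoBandIdeogram.bed"
)

# Assumed JSON shape of the data source declared with gos.csv(...) above.
csv_data = {
    "type": "csv",  # assumption: the data type is stored under "type"
    "url": data_url,
    "headerNames": ["chrom", "chromStart", "chromEnd", "name", "stain"],
    "chromosomeField": "chrom",  # column holding chromosome names
    "genomicFields": ["chromStart", "chromEnd"],  # genomic coordinate columns
    "separator": "\t",  # BED files are tab-separated
}

# Serialize to JSON, the format consumed by the Gosling grammar.
print(json.dumps(csv_data, indent=2))
```

Writing the source out this way makes it clear that nothing is downloaded at definition time; the dictionary only describes where the data lives and how to parse it.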

The _Track_ above is now bound to the genomics data, but we haven't declared how to map the dataset to visual properties. For this, we will use the `gos.Track.mark_*()` and `gos.Track.encode()` methods to specify a **mark** and what **visual encodings** to apply.

In [None]:
track = gos.Track(data).mark_rect().encode(
    # defines start and end of rectangle mark
    x=gos.X("chromStart:G", axis="top"),
    xe=gos.Xe("chromEnd:G"),
    # defines how to map Giemsa-stain factor to colors
    color=gos.Color(
        "stain:N", 
        domain=["gneg", "gpos25", "gpos50", "gpos75", "gpos100", "gvar"],
        range=["white", "#D9D9D9", "#979797", "#636363", "black", "#A0A0F2"]
    ),
    # customize the style of the visual marks. 
    size=gos.value(20),
    stroke=gos.value("gray"),
    strokeWidth=gos.value(0.5)
)

track
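
The strings `"chromStart:G"` and `"stain:N"` used above are shorthand that pairs a field name with a one-letter type code. The sketch below shows one way such shorthand could expand into the `field`/`type` form used in Gosling JSON. `expand_shorthand` and `TYPE_CODES` are hypothetical names introduced for illustration; this is not the implementation inside `gos`.

```python
# Assumed mapping from one-letter codes to Gosling field types.
TYPE_CODES = {"G": "genomic", "N": "nominal", "Q": "quantitative"}


def expand_shorthand(shorthand: str) -> dict:
    """Hypothetical helper: split "field:CODE" into a field/type mapping."""
    field, sep, code = shorthand.rpartition(":")
    if not sep:
        # No type code given, so only the field name is known.
        return {"field": shorthand}
    if code not in TYPE_CODES:
        raise ValueError(f"unknown type code: {code!r}")
    return {"field": field, "type": TYPE_CODES[code]}


# Extra keyword arguments (such as axis="top") sit alongside field and type.
x_channel = {**expand_shorthand("chromStart:G"), "axis": "top"}
xe_channel = expand_shorthand("chromEnd:G")
# Constant encodings such as gos.value(20) are assumed to map to {"value": ...}.
size_channel = {"value": 20}

print(x_channel)
print(xe_channel)
print(size_channel)
```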

Our `gos.Track` is now fully specified. However, the Gosling grammar requires the root of every visualization to be a _View_, which may contain one or more _Tracks_.

To complete a Gosling specification for the track in isolation, we use the `gos.Track.view()` method to wrap the track in a `gos.View`. In Jupyter or Google Colab, the visualization is automatically rendered below the cell instead of printing a Python object as above.

In [None]:
track.view() # voila!

Additional parameters for the resulting `gos.View` can be passed in as well for convenience. Here we set a `title` and an `xDomain` for our visualization, so that the initial genomic region displayed is "chr1".

In [None]:
track.view(
    title="Gos is awesome!",
    xDomain=gos.GenomicDomain(chromosome="chr1"),
)
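
As a rough picture of what the call above produces, the sketch below wraps a track dictionary in a root view using plain Python. `wrap_in_view` is a hypothetical helper, the track is abbreviated, and the `tracks`, `title`, and `xDomain` keys reflect an assumed layout of the Gosling JSON rather than verified library output.

```python
import json

# Abbreviated stand-in for the fully specified track (illustrative only).
track_spec = {
    "mark": "rect",
    "x": {"field": "chromStart", "type": "genomic", "axis": "top"},
    "xe": {"field": "chromEnd", "type": "genomic"},
}


def wrap_in_view(track: dict, **view_props) -> dict:
    """Hypothetical helper: place a single track under a root view."""
    return {**view_props, "tracks": [track]}


view_spec = wrap_in_view(
    track_spec,
    title="Gos is awesome!",
    xDomain={"chromosome": "chr1"},  # assumed JSON form of gos.GenomicDomain
)

print(json.dumps(view_spec, indent=2))
```

The track dictionary itself is left unchanged, which is why the same track can be wrapped again with different view properties.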

> **NOTE** We reuse the `track` instance to create new, modified views. This is a very common pattern in **`gos`** and is what makes the Python API much more concise than the JSON equivalent.

## Arranging Tracks into Views

Tracks can be arranged as separate views via the `gos.horizontal` and `gos.vertical` layout utilities. Here we reuse the track definition from above in a multi-view layout.

In [None]:
gos.horizontal(
    # left, vertically stacked tracks
    gos.vertical(
        track.encode().properties(width=300, height=100),
        track.properties(width=300, height=100)
    ),
    # right, new track with alternative colormapping
    track.encode(
        color=gos.Color(
            "stain:N", 
            domain=["gneg", "gpos25", "gpos50", "gpos75", "gpos100", "gvar"],
            range=["white", "#D9D9D9", "#979797", "#636363", "black", "#FF8F00"] # change gvar to orange
        ),
    ).properties(width=600, height=240)
)
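
The nesting in the layout above can be pictured as nested dictionaries. The sketch below is a plain-Python approximation under stated assumptions: `with_properties`, `as_view`, and `arrange` are hypothetical helpers, the track is abbreviated, and the `arrangement` and `views` keys follow an assumed layout of the Gosling JSON. It also shows the copy-on-modify pattern that lets a single track definition be reused.

```python
def with_properties(spec: dict, **props) -> dict:
    """Hypothetical helper: return a copy of spec with extra properties."""
    return {**spec, **props}


def as_view(track: dict) -> dict:
    """Hypothetical helper: wrap a single track in a view."""
    return {"tracks": [track]}


def arrange(direction: str, *views: dict) -> dict:
    """Hypothetical helper: group views under one arrangement."""
    return {"arrangement": direction, "views": list(views)}


base_track = {"mark": "rect"}  # abbreviated track definition

# Each call returns a new dictionary; base_track is never mutated.
small = with_properties(base_track, width=300, height=100)
large = with_properties(base_track, width=600, height=240)

layout = arrange(
    "horizontal",
    arrange("vertical", as_view(small), as_view(small)),  # first: stacked pair
    as_view(large),  # second: single larger view
)

print(layout["arrangement"], len(layout["views"]))
```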

The visualization above is not very useful or informative on its own. Rather, it is meant to introduce how features of the Gos API compose to create sophisticated interactive genomics visualizations.

You can read more about [Gosling](http://gosling-lang.org/) to learn about the grammar features that are available in **gos**, and check out the **gos** [documentation](https://gosling-lang.github.io/gos/gallery/index.html) for more complex examples.