## Warming up: what will you need to work with the Planetary Computer Hub?

The Planetary Computer Hub development environment relies on open-source tools to work with the data. To welcome everyone who may not be familiar with using Python tools for academic research, we present an overview of those tools along with some reference guides.

### Tools landscape

### Retrieving collections

#### Pystac
[PySTAC introduction tutorial](https://pystac.readthedocs.io/en/stable/tutorials/pystac-introduction.html)

#### Pystac-client
[pystac-client documentation](https://pystac-client.readthedocs.io/en/latest/)

### Data manipulation

#### Pandas

Pandas is the go-to library for handling tabular data in a neat, consistent and programmatic way.
[Project Pythia's introduction to Pandas](https://foundations.projectpythia.org/core/pandas/pandas.html) covers the basics of what you will need, such as slicing and running an exploratory analysis on _DataFrames_ and _Series_.
For anything beyond that, the [official Pandas documentation](https://pandas.pydata.org/docs/index.html) is a thorough reference.
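
As a minimal sketch of the slicing and exploratory steps mentioned above, the snippet below builds a small _DataFrame_ and pulls a few views out of it. The station names and measurements are hypothetical values invented for illustration; they do not come from any Planetary Computer dataset.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
df = pd.DataFrame(
    {
        "station": ["A", "B", "C", "D"],
        "elevation_m": [120.0, 560.0, 35.0, 980.0],
        "rainfall_mm": [810.5, 1240.0, 655.2, 1502.3],
    }
)

# A single column of a DataFrame is a Series.
rainfall = df["rainfall_mm"]

# Boolean and label-based slicing with .loc:
# rows above 100 m of elevation, restricted to two columns.
highland = df.loc[df["elevation_m"] > 100, ["station", "rainfall_mm"]]

# Position-based slicing with .iloc: the first two rows.
first_two = df.iloc[:2]

# Quick exploratory summary of the numeric columns.
summary = df.describe()
print(summary)
```

`.loc` selects by label or boolean mask, while `.iloc` selects by integer position; keeping the two apart avoids a common source of off-by-one mistakes.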

#### Geopandas

> GeoPandas, as the name suggests, extends the popular data science library pandas by adding support for geospatial data.
>
> \- [GeoPandas documentation](https://geopandas.org/en/stable/getting_started/introduction.html#Concepts)

We use GeoPandas because it handles points, lines and polygons (that is, [vector](https://en.wikipedia.org/wiki/Vector_graphics) data).
Each shape carries its own associated data and can be referenced to a [Coordinate Reference System](https://en.wikipedia.org/wiki/Spatial_reference_system) (CRS).
A CRS is usually identified by an [EPSG](https://en.wikipedia.org/wiki/EPSG_Geodetic_Parameter_Dataset) code, and different codes correspond to different coordinate systems and projections of our planet.
This way, GeoPandas lets us delimit areas on a map and attach our data to them.
The most common example of this kind of usage is a classification map.
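
The sketch below illustrates the ideas above under stated assumptions: two hypothetical point features with made-up attributes, declared in EPSG:4326 (longitude/latitude), reprojected to EPSG:3857 (Web Mercator), and then filtered by an arbitrary rectangle standing in for an area of interest. None of the names or coordinates come from a real dataset.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Hypothetical features, invented for illustration only.
gdf = gpd.GeoDataFrame(
    {
        "name": ["site_a", "site_b"],
        "land_cover": ["forest", "cropland"],
    },
    geometry=[Point(-122.3, 47.6), Point(-46.6, -23.5)],
    crs="EPSG:4326",  # longitude/latitude in degrees (WGS 84)
)

# Reproject to Web Mercator (EPSG:3857), whose units are metres.
gdf_mercator = gdf.to_crs(epsg=3857)

# A rectangle delimiting an area of interest, in the same CRS as gdf.
area = Polygon([(-123, 47), (-122, 47), (-122, 48), (-123, 48)])

# Keep only the features that fall inside the area.
inside = gdf[gdf.within(area)]
print(inside[["name", "land_cover"]])
```

Spatial predicates such as `within` compare raw coordinates, so both geometries should share the same CRS before they are tested against each other.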

#### Xarray

Xarray is our tool of choice for dealing with images and other [raster](https://en.wikipedia.org/wiki/Raster_graphics) data sources. The [Why Xarray?](https://docs.xarray.dev/en/stable/getting-started-guide/why-xarray.html) page explains the motivation:

> Xarray introduces labels in the form of dimensions, coordinates and attributes on top of raw NumPy-like multidimensional arrays, which allows for a more intuitive, more concise, and less error-prone developer experience.

Xarray borrows heavily from Pandas, so it boosts our productivity by offering a widely used, familiar interface.
It also integrates tightly with Dask, which in turn boosts our computing power through parallel processing.
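
To make the idea of labels concrete, here is a minimal sketch that wraps a NumPy array in a _DataArray_ with named dimensions and coordinates. The two-band raster, its coordinate values and the `units` attribute are hypothetical and exist only for illustration.

```python
import numpy as np
import xarray as xr

# Hypothetical 2-band, 3x4 raster, invented for illustration only.
data = np.arange(24, dtype="float64").reshape(2, 3, 4)

raster = xr.DataArray(
    data,
    dims=("band", "y", "x"),
    coords={
        "band": ["red", "nir"],
        "y": [30.0, 20.0, 10.0],
        "x": [0.0, 10.0, 20.0, 30.0],
    },
    attrs={"units": "reflectance"},
    name="scene",
)

# Select by label instead of by integer position.
red = raster.sel(band="red")
pixel = raster.sel(band="nir", x=10.0, y=20.0)

# Reduce over named dimensions rather than axis numbers.
band_means = raster.mean(dim=("y", "x"))
print(band_means.values)
```

Selecting with `sel(band="nir")` reads much like a Pandas `.loc` lookup, which is the kind of familiarity described above.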

For those looking for an introduction to _DataArrays_ and _Datasets_ with Xarray, we recommend [Project Pythia's introduction to Xarray](https://foundations.projectpythia.org/core/xarray/xarray-intro.html), given its focus on scientific use cases.
Nonetheless, it is always good to keep the [official Xarray documentation](https://docs.xarray.dev/en/stable/index.html) at hand to explore its full range of features.

### Computing power

#### Dask
[Get started with Dask](https://www.dask.org/get-started)