# Introduction

Topics included in this notebook:
* [Overview of OmniSci](#overview)
* [Tools Available for Data Science](#available_tools)
* [Installation](#installation)

<a id='overview'></a>
## Overview of OmniSci


OmniSci is an analytics platform designed to handle very large datasets. It leverages the processing power of GPUs alongside traditional CPUs to achieve very high performance. OmniSci combines an open-source SQL engine (OmniSciDB), server-side rendering (OmniSci Render), and web-based data visualization (OmniSci Immerse) to provide a comprehensive platform for data analysis.

**OmniSciDB**   
The foundation of the platform is OmniSciDB, an open-source, GPU-accelerated database. OmniSciDB harnesses GPU processing power and returns SQL query results in milliseconds, even on tables with billions of rows. OmniSciDB delivers high performance with rapid query compilation, query vectorization, and advanced memory management.  

**Native SQL**    
With native SQL support, OmniSciDB returns query results hundreds of times faster than CPU-only analytical database platforms. Use your existing SQL knowledge to query data. You can use the standalone SQL engine from the command line, or the SQL editor that is part of the OmniSci Immerse visual analytics interface. SQL query results can be sent to OmniSci Immerse or to third-party software such as Birst, Power BI, Qlik, or Tableau.   

**Geospatial Data**  
OmniSciDB can store and query data using native Open Geospatial Consortium (OGC) types, including POINT, LINESTRING, POLYGON, and MULTIPOLYGON. With geo type support, you can query geo data at scale using special geospatial functions. Using the power of GPU processing, you can quickly and interactively calculate distances between two points and intersections between objects.  
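
As a rough illustration of the geospatial functions mentioned above, the sketch below builds two SQL strings in Python. The table and column names (`trips`, `pickup_point`, `dropoff_point`, `zones`, `boundary`) are hypothetical and not part of this notebook, and the units returned by `ST_Distance` depend on the SRID of the columns. `ST_Contains` is used here as one example of a spatial predicate.

```python
def distance_query(table, point_a, point_b, limit=5):
    """Build a query returning the distance between two POINT columns."""
    return (
        f"SELECT ST_Distance({point_a}, {point_b}) AS dist "
        f"FROM {table} LIMIT {limit}"
    )


def contains_count_query(polygon_table, polygon_col, point_table, point_col):
    """Build a query counting (point, polygon) pairs where the polygon contains the point."""
    return (
        f"SELECT COUNT(*) AS n "
        f"FROM {point_table} AS p, {polygon_table} AS z "
        f"WHERE ST_Contains(z.{polygon_col}, p.{point_col})"
    )


# Hypothetical table and column names, for illustration only.
print(distance_query("trips", "pickup_point", "dropoff_point"))
print(contains_count_query("zones", "boundary", "trips", "pickup_point"))
```

The resulting strings can be passed to any SQL client connected to OmniSciDB, such as the command line or pymapd.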

**Open Source**  
OmniSciDB is open source and encourages contribution and innovation from a global community of users. It is available on GitHub under the Apache 2.0 license, along with components such as a Python interface (pymapd) and JavaScript infrastructure (mapd-connector, mapd-charting), making OmniSci the leader in open-source analytics.  

<a id='available_tools'></a>
## Tools Available for Data Science

If you are an OmniSci open source edition user, you will not have access to Immerse, but you can still explore OmniSci with the Data Science Foundation tools.   

**Ibis**  
Ibis is a productivity API for working in Python and analyzing data in remote SQL-based data stores such as OmniSciDB. Inspired by the pandas toolkit for data analysis, Ibis provides a Pythonic API that compiles to SQL. Combined with OmniSciDB scale and speed, Ibis offers a familiar but more powerful method for analyzing very large datasets "in-place."   
Ibis supports multiple SQL database backends, and also supports pandas as a native backend. Combined with Altair, this integration allows you to explore multiple datasets across different data sources.
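
A minimal sketch of the "in-place" workflow described above. It assumes an Ibis release that ships the `ibis.omniscidb` backend, a local server that still uses the default credentials, and a sample table named `flights_2008_10k` with `carrier_name` and `arrdelay` columns. All of these are assumptions for illustration, not details taken from this notebook.

```python
def connect_local():
    """Connect to a local OmniSciDB server (assumed default credentials)."""
    import ibis  # imported lazily so the helper below works without Ibis installed

    return ibis.omniscidb.connect(
        host="localhost",
        port=6274,
        user="admin",
        password="HyperInteractive",
        database="omnisci",
    )


def mean_delay_by_carrier(con, table_name="flights_2008_10k"):
    """Build a lazy Ibis expression; nothing runs on the server yet."""
    t = con.table(table_name)  # assumed sample table
    return t.group_by("carrier_name").aggregate(avg_delay=t.arrdelay.mean())


# Usage (requires a running server):
# con = connect_local()
# expr = mean_delay_by_carrier(con)
# df = expr.execute()  # compiled to SQL, run in OmniSciDB, returned as a pandas DataFrame
```

The expression is only a description of the query until `execute()` is called, so the aggregation happens inside the database and only the small result set travels back to Python.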

**Altair**  
Altair is another key component of the OmniSci data science foundation. Building on the same Vega data visualization engine used by Immerse for geospatial charts, Altair provides a Pythonic API over Vega-Lite, a subset of the full Vega specification for declarative charting based on the "Grammar of Graphics" paradigm. The OmniSci data science foundation goes further and includes interface code that lets Altair transparently use Ibis expressions instead of pandas data frames. This allows data visualization over much larger datasets in OmniSci without writing SQL code.





<a id='installation'></a>
## Installation

**Create an environment for the notebooks in this repository**   
If you haven't already, install [Miniconda](https://docs.conda.io/en/latest/miniconda.html) on your machine.  
  
Create the environment for the examples in this repository:  
`conda env create -f environment.yml -n omnisci`   
  
Activate the environment:  
`conda activate omnisci`
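
Once the environment is active, a quick check like the following can confirm that the main packages are importable. The module names are assumptions based on the tools this notebook mentions; the actual contents of `environment.yml` may differ.

```python
import importlib.util

# Assumed import names for the tools described in this notebook;
# adjust the tuple to match what environment.yml actually installs.
EXPECTED_MODULES = ("ibis", "altair", "pymapd")


def missing_modules(names=EXPECTED_MODULES):
    """Return the names that cannot be imported in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules()
print("Missing modules:", missing if missing else "none")
```

If anything is reported as missing, re-creating the environment from `environment.yml` is usually the simplest fix.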


**Start a local OmniSci server**  
***Option 1: Local OmniSci server using Docker***  
Using the OmniSci Docker container for Jupyter tools:
you can download a [pre-built Docker container](https://docs.omnisci.com/installation-and-configuration/installation/install-docker) for these tools and start them as a standalone container.

In [None]:
# Once you have docker running locally,
# pull the docker image for Open Source CPU installation
! docker pull omnisci/core-os-cpu:v5.2.2
# create a local storage directory
! mkdir -p ${HOME}/omnisci-storage
# copy the omnisci configuration file into the storage directory
! cp ../omnisci.conf ${HOME}/omnisci-storage
# run docker container
! docker run -d --name omnisci -p 6274:6274 -v ${HOME}/omnisci-storage:/omnisci-storage omnisci/core-os-cpu:v5.2.2

> Note: If you receive an error message saying that the container name is already in use, you can remove the old container and start fresh with   
`docker rm old_container_id`
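
To confirm that the container is reachable, a small check along these lines can help. It is only a sketch: it assumes the server is published on `localhost:6274`, as in the `docker run` command above, and it only verifies that something accepts TCP connections on that port, not that the database has finished starting.

```python
import socket


def port_is_open(host="localhost", port=6274, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


print("OmniSciDB port reachable:", port_is_open())
```

If this prints `False` shortly after starting the container, waiting a few seconds and retrying is reasonable before inspecting the output of `docker logs omnisci`.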

***Option 2: Local OmniSci server using installer***  
OmniSci Open Source is available for local installation. You can download the executables for your machine from the [OmniSci website](https://www.omnisci.com/platform/downloads).    
  
A preview for Mac installs is also [available](https://www.omnisci.com/mac-preview).  
  
***Option 3: Remote OmniSci servers***  
In addition to working with data you load on a local server, you can connect to the sample OmniSci servers to explore more sample data.  