docs: updates
jarulraj committed Oct 9, 2023
1 parent 5f27824 commit c952858
Showing 9 changed files with 164 additions and 168 deletions.
16 changes: 7 additions & 9 deletions docs/_toc.yml
@@ -9,14 +9,11 @@ parts:
title: Installation Options
- file: source/overview/concepts
title: Concepts
- file: source/overview/model-inference
title: AI Models
sections:
- file: source/overview/concepts/data-sources
title: Data Sources
- file: source/overview/connect-to-database
- file: source/overview/ai-queries
title: AI Queries
- file: source/overview/connect-to-data-sources
title: Connect to Data Sources
#- file: source/overview/faq
- file: source/overview/faq

- caption: Use Cases
chapters:
@@ -98,8 +95,9 @@ parts:
title: OpenAI
- file: source/reference/ai/yolo
title: YOLO
- file: source/reference/ai/custom
title: Custom Model

- file: source/reference/ai/custom-ai-function
title: Bring Your Own AI Function

- file: source/reference/optimizations
title: Optimizations
136 changes: 136 additions & 0 deletions docs/source/overview/ai-queries.rst
@@ -0,0 +1,136 @@
.. _ai-queries:

AI Queries
==========

In EvaDB, AI models are simple function calls similar to traditional SQL functions like ``MAX``.

This page details how you can use AI models in different ways to construct AI queries in EvaDB. EvaDB automatically optimizes AI queries to save money and time, as detailed in the :ref:`optimizations<optimizations>` page.

.. note::

   EvaDB ships with a wide range of built-in functions listed in the :ref:`models` page. If your desired AI model is not available, you can also bring your own AI function by referring to the :ref:`custom_ai_function` page.

SELECT Clause
-------------

AI queries often contain the AI function(s) in the ``SELECT`` clause (projection list).

For example, the following query calls the `MnistImageClassifier <https://github.com/georgia-tech-db/evadb/blob/staging/evadb/functions/mnist_image_classifier.py>`_ function to identify digits in a collection of frames in the ``mnist_video`` table.

.. code-block:: sql

   SELECT MnistImageClassifier(data).label FROM mnist_video;

WHERE Clause
------------

AI functions also frequently appear in the ``WHERE`` clause (selection), where model inference acts as a filter.

For example, the following query uses the ``TextSummarizer`` and ``TextClassifier`` functions from the :ref:`HuggingFace<hf>` AI engine: ``TextSummarizer`` summarizes food reviews in the ``SELECT`` clause, while ``TextClassifier`` keeps only the reviews expressing a ``NEGATIVE`` sentiment in the ``WHERE`` clause.
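As a rough mental model of an AI function used as a filter and as a projection, here is a plain-Python sketch. It does not use EvaDB; ``classify_sentiment`` and ``summarize`` are hypothetical toy stand-ins for ``TextClassifier`` and ``TextSummarizer``, and the filter-then-project order is the usual SQL reading rather than a statement about EvaDB's execution plan.

```python
def classify_sentiment(text):
    """Hypothetical stand-in for TextClassifier(data).label (toy keyword rule)."""
    negative_words = {"bad", "cold", "awful", "stale"}
    words = {word.strip(".,!").lower() for word in text.split()}
    return "NEGATIVE" if words & negative_words else "POSITIVE"


def summarize(text, max_words=4):
    """Hypothetical stand-in for TextSummarizer(data): keep the first few words."""
    return " ".join(text.split()[:max_words])


def select_where(rows, predicate, projection):
    """Yield projection(row) only for the rows that satisfy predicate(row)."""
    for row in rows:
        if predicate(row):
            yield projection(row)


food_reviews = [
    "The soup was cold and the bread was stale.",
    "Great pasta, friendly staff!",
    "Awful service, never again.",
]

negative_summaries = list(
    select_where(
        food_reviews,
        predicate=lambda review: classify_sentiment(review) == "NEGATIVE",
        projection=summarize,
    )
)
print(negative_summaries)
```

In this sketch the summarizer only runs on reviews that pass the classifier. The EvaDB query that expresses the same intent declaratively follows.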

.. code-block:: sql

   SELECT TextSummarizer(data)
   FROM food_reviews
   WHERE TextClassifier(data).label = 'NEGATIVE';

ARRAY Operators
---------------

EvaDB supports specialized array operators.

For example, the following query applies the ``CONTAIN`` operator (``@>``) on the output of an object detection function:

.. code-block:: sql

   SELECT id
   FROM camera_videos
   WHERE ObjectDetector(data).labels @> ['person', 'car'];

Here is another query with the ``UNNEST`` function that flattens the output of a `one-input-to-many-outputs` AI function.

.. code-block:: sql

   SELECT UNNEST(FaceDetector(data)) AS Face(bbox, conf)
   FROM movie;

The `face detector <https://github.com/georgia-tech-db/evadb/blob/staging/evadb/functions/face_detector.py>`_ model returns multiple outputs (e.g., bounding box and confidence score) as an array. The ``UNNEST`` function unrolls elements from the array into multiple rows.
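The two array operations in this section can be pictured with a short plain-Python sketch. It is only an analogy under stated assumptions: ``contains_all`` and ``unnest`` are hypothetical helpers (not EvaDB functions), containment is read as "every listed label is present", and the detections are made-up data.

```python
def contains_all(labels, required):
    """Rough analogue of `labels @> required`: every required label is present."""
    return set(required).issubset(labels)


def unnest(rows, detect):
    """Rough analogue of UNNEST: emit one output row per element returned by detect(row)."""
    for row in rows:
        for bbox, conf in detect(row):
            yield {"bbox": bbox, "conf": conf}


# Made-up detector outputs, one list of labels per video row.
camera_videos = [
    {"id": 1, "labels": ["person", "car", "dog"]},
    {"id": 2, "labels": ["car"]},
    {"id": 3, "labels": ["car", "person"]},
]
matching_ids = [
    row["id"]
    for row in camera_videos
    if contains_all(row["labels"], ["person", "car"])
]

# Made-up face detections: each frame maps to a list of (bbox, confidence) pairs.
fake_detections = {
    "frame-0": [((10, 10, 50, 50), 0.98), ((80, 20, 120, 60), 0.91)],
    "frame-1": [],
    "frame-2": [((30, 40, 70, 90), 0.87)],
}
frames = ["frame-0", "frame-1", "frame-2"]
faces = list(unnest(frames, lambda frame: fake_detections[frame]))

print(matching_ids)
print(len(faces))
```

Note that a frame with no detections contributes no rows after unnesting, while a frame with two faces contributes two.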

LATERAL JOIN
------------

For more challenging AI apps, EvaDB supports lateral joins.

The following AI query uses both a ``LATERAL JOIN`` and an ``UNNEST`` function to detect emotions from faces in a movie, where a single scene may contain multiple faces. The output of the `face detector <https://github.com/georgia-tech-db/evadb/blob/staging/evadb/functions/face_detector.py>`_ is used to crop the bounding box from the image, and the cropped image is then sent to an `emotion detector <https://github.com/georgia-tech-db/evadb/blob/staging/evadb/functions/emotion_detector.py>`_ to detect the emotion of the face inside the bounding box.
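The data flow described above (detect, unnest, crop, classify) can be sketched in plain Python. Everything here is an illustrative assumption rather than EvaDB code: frames are tiny 2-D lists of pixel values, each frame carries precomputed detections, the bounding box layout ``(top, left, bottom, right)`` is invented for the example, and ``detect_emotion`` is a toy brightness rule.

```python
def crop(image, bbox):
    """Cut the (top, left, bottom, right) region out of a 2-D list of pixels."""
    top, left, bottom, right = bbox
    return [row[left:right] for row in image[top:bottom]]


def lateral_join_unnest(frames, detect_faces):
    """Pair each frame with every face detected in that same frame."""
    for frame in frames:
        for bbox, conf in detect_faces(frame):
            yield frame, bbox, conf


def detect_emotion(face_pixels):
    """Toy stand-in for an emotion model: bright crops are 'happy', dark ones 'sad'."""
    pixels = [pixel for row in face_pixels for pixel in row]
    return "happy" if sum(pixels) / len(pixels) >= 128 else "sad"


# Made-up movie: the first frame has two faces, the second has none.
movie = [
    {
        "data": [[200, 200, 10, 10], [200, 200, 10, 10]],
        "faces": [((0, 0, 2, 2), 0.9), ((0, 2, 2, 4), 0.8)],
    },
    {"data": [[0, 0], [0, 0]], "faces": []},
]

emotions = [
    detect_emotion(crop(frame["data"], bbox))
    for frame, bbox, conf in lateral_join_unnest(movie, lambda frame: frame["faces"])
]
print(emotions)
```

The key property of the lateral join is that the inner detector sees the current frame, so each output row keeps the frame and one of its faces side by side. The EvaDB query that expresses this flow declaratively is shown next.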

.. code-block:: sql

   SELECT EmotionDetector(Crop(data, Face.bbox))
   FROM movie
   LATERAL JOIN UNNEST(FaceDetector(data)) AS Face(bbox, conf);

Ordering
--------

AI models may also be used in the ``ORDER BY`` clause to enable use cases like similarity search.

For example, in the following query, the output of the `SentenceFeatureExtractor <https://github.com/georgia-tech-db/evadb/blob/staging/evadb/functions/sentence_feature_extractor.py>`_
is used to find relevant context for answering the user's question (`When was the NATO created`) from a collection of PDFs.

.. code-block:: sql

   SELECT data FROM MyPDFs
   ORDER BY Similarity(
       SentenceFeatureExtractor('When was the NATO created?'),
       SentenceFeatureExtractor(data)
   );

Similarity search maps to ordering based on the distance computed by the ``Similarity`` function, between the features extracted from the query and those extracted from the paragraphs loaded from the documents. EvaDB automatically accelerates such queries using vector databases.
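To make "ordering by distance between features" concrete, here is a small brute-force sketch in plain Python. It is an illustration under assumptions, not how EvaDB or a vector database computes the result: ``embed`` is a hypothetical bag-of-words extractor over a tiny invented vocabulary, and Euclidean distance is chosen only for simplicity.

```python
import math
import re

# Invented vocabulary for the toy feature extractor.
VOCABULARY = ["nato", "created", "1949", "pasta", "recipe"]


def embed(text):
    """Hypothetical feature extractor: count vocabulary words in the text."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [tokens.count(word) for word in VOCABULARY]


def rank_by_similarity(question, paragraphs):
    """Return paragraphs sorted by ascending distance to the question's features."""
    question_features = embed(question)
    return sorted(
        paragraphs,
        key=lambda paragraph: math.dist(question_features, embed(paragraph)),
    )


paragraphs = [
    "A pasta recipe needs salt.",
    "The treaty was signed in 1949.",
    "NATO was created in 1949.",
]
ranked = rank_by_similarity("When was the NATO created?", paragraphs)
print(ranked[0])
```

This brute-force version compares the question against every paragraph. A vector index serves the same ordering without scanning all rows, which is the kind of acceleration the paragraph above refers to.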

.. note::

   Go over the `PrivateGPT <https://github.com/georgia-tech-db/evadb/blob/staging/tutorials/13-privategpt.ipynb>`_ notebook for more details.

Given a query image, we can use a different feature extractor (the `SiftFeatureExtractor <https://github.com/georgia-tech-db/evadb/blob/staging/evadb/functions/sift_feature_extractor.py>`_ function) to find the most similar image in an existing collection of images (``reddit_dataset``).

.. code-block:: sql

   SELECT name
   FROM reddit_dataset
   ORDER BY Similarity(
       SiftFeatureExtractor(Open('reddit-images/cat.jpg')),
       SiftFeatureExtractor(data)
   );

.. note::

   Go over the :ref:`Image Search <image-search>` page for more details.


Aggregate Functions
-------------------

AI models can be applied to a sequence of tuples using the ``GROUP BY`` clause together with the ``SEGMENT`` function.
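As a rough mental model of the fixed-size grouping that a clause such as ``GROUP BY '16 frames'`` requests, consider this plain-Python sketch. The ``segment`` helper is hypothetical, sampling is not modeled, and keeping a shorter final group is an assumption of the sketch, not a documented EvaDB behavior.

```python
from itertools import islice


def segment(rows, size):
    """Group consecutive rows into lists of `size` rows.

    Assumption of this sketch: a shorter final group is kept rather than dropped.
    """
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def describe(frames):
    """Toy stand-in for a model that consumes a whole segment at once."""
    return f"frames {frames[0]}-{frames[-1]}"


frame_ids = list(range(40))
segments = list(segment(frame_ids, 16))
descriptions = [describe(group) for group in segments]
print(descriptions)
```

The point of the grouping is that the model receives one call per segment instead of one call per row.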

The following query concatenates consecutive frames in a movie into a single segment and applies an action recognition model on the segment:

.. code-block:: sql

   SELECT ASLActionRecognition(SEGMENT(data))
   FROM ASL_ACTIONS
   SAMPLE 5
   GROUP BY '16 frames';

Here is another illustrative query that groups together paragraphs from a PDF document:

.. code-block:: sql

   SELECT SEGMENT(data)
   FROM MyPDFs
   GROUP BY '10 paragraphs';

.. note::

   The :ref:`use cases <sentiment-analysis>` illustrate more ways of utilizing AI queries for building AI apps.
20 changes: 0 additions & 20 deletions docs/source/overview/concepts/data-sources.rst

This file was deleted.

49 changes: 16 additions & 33 deletions docs/source/overview/faq.rst
@@ -1,53 +1,36 @@
:orphan:
Frequently Asked Questions
==========================

===
FAQ
===
.. _faq:

These are some Frequently Asked Questions that we've seen pop up for EvaDB.
Here are some frequently asked questions that we have seen pop up for EvaDB.

If you still have questions after reading this FAQ, ping us on
`our Slack <https://join.slack.com/t/eva-db/shared_invite/zt-1i10zyddy-PlJ4iawLdurDv~aIAq90Dg>`__!
.. note::

Why am I not able to install EvaDB in my Python environment?
============================================================
Have another question or want to give feedback? Ask us on `Slack <https://evadb.ai/community>`__!

Ensure that the local Python version is >= 3.8 and <= 3.10. EvaDB cannot support 3.11 due to its `dependency on Ray <https://github.com/autogluon/autogluon/issues/2687>`__.
Why am I not able to install EvaDB?
-----------------------------------

Where does EvaDB store all the data?
====================================

By default, EvaDB stores all the data in a local folder named ``evadb_data``. Deleting this folder will reset the system's state and lead to data loss.

Why does the EvaDB server not start?
====================================

Check if another process is already running on the target port where EvaDB server is being launched (default port of EvaDB is ``8803``) using these commands:
Ensure that the Python interpreter's version is >= `3.9`.

.. code-block:: bash
.. note::

sudo lsof -i :<port_number>
kill -9 <process_id>
If you are using the `evadb[ray]` installation option, ensure that the Python version is <= `3.10` due to a `Ray issue <https://github.com/autogluon/autogluon/issues/2687>`_. Follow `these instructions <https://github.com/ray-project/ray/issues/33039>`_ to install `ray`.

You can either kill that process or launch EvaDB server on another free port in this way:

.. code-block:: bash
Where does EvaDB store all the data?
------------------------------------

evadb_server -p 9330
By default, EvaDB connects to **existing** data sources like SQL database systems. It stores all the meta-data (i.e., data about data sources) in a local folder named ``evadb_data``. Deleting this folder will reset EvaDB's state and lead to data loss.

Why do I see no output from the server?
=======================================
---------------------------------------

If a query runs a complex vision task (such as object detection) on a long video, the query is expected to take a non-trivial amount of time to finish.
You can check the status of the server by running ``top`` or ``pgrep``:
If a query runs a complex AI task (e.g., sentiment analysis) on a large table, the query is expected to take a non-trivial amount of time to finish. You can check the status of the server by running ``top`` or ``pgrep``:

.. code-block:: bash
top
pgrep evadb_server
pip install ray fails because of grpcio
=======================================

Follow these instructions to install ``ray``:
https://github.com/ray-project/ray/issues/33039
2 changes: 1 addition & 1 deletion docs/source/overview/getting-started.rst
@@ -35,7 +35,7 @@ Now, activate the virtual environment:

.. code-block:: bash
pip install evadb --upgrade
pip install --upgrade evadb
.. note::

101 changes: 0 additions & 101 deletions docs/source/overview/model-inference.rst

This file was deleted.

@@ -1,7 +1,7 @@
.. _udf:
.. _custom_ai_function:

Functions
======================
Bring Your Own AI Function
==========================

This section provides an overview of how you can create and use a custom function in your queries. For example, you could write a function that wraps around your custom PyTorch model.

2 changes: 1 addition & 1 deletion docs/source/usecases/question-answering.rst
@@ -61,7 +61,7 @@ EvaDB has built-in support for ``ChatGPT`` function from ``OpenAI``. You will ne
.. note::

EvaDB has built-in support for a wide range of :ref:`OpenAI<openai>` models. You can also switch to another large language models that runs locally by defining a :ref:`Custom function<udf>`.
EvaDB has built-in support for a wide range of :ref:`OpenAI<openai>` models. You can also switch to another large language model that runs locally by defining a :ref:`custom AI function<custom_ai_function>`.


ChatGPT function is a wrapper around OpenAI API call. You can also switch to other LLM models that can run locally.
