Add OpenAI Provider #35023

Merged: 24 commits, Nov 7, 2023

utkarsharma2
Contributor

This PR is part of our larger effort, presented at Airflow Summit, to add first-class integrations that support LLMOps.

This PR adds the OpenAI Provider. OpenAI is a leading American artificial intelligence organization that offers ChatGPT, one of the most widely used LLMs, as well as embedding models.

The primary objective of this provider is to offer users an alternative embedding model. This lets them generate vectors for their proprietary data, a pivotal step towards integrating with LLMs like ChatGPT.

Example DAG:
The OpenAIEmbeddingOperator can accept either a string or a callable returning a list of strings.

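from airflow.providers.openai.operators.openai import OpenAIEmbeddingOperator

# xcom_text is defined earlier in the example DAG (not shown in this excerpt).
OpenAIEmbeddingOperator(
    task_id="embedding_using_xcom_data",
    conn_id="openai_default",
    input_text=xcom_text["input_text"],
    model="text-embedding-ada-002",
)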
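To illustrate the two input forms mentioned above (a string, or a callable returning a list of strings), here is a minimal sketch of how such an input could be normalized before being sent for embedding. The helper name `resolve_input_text` and its validation rules are hypothetical and for illustration only; this is not the provider's actual implementation.

```python
from typing import Callable, List, Union

# Assumed shape of the accepted input: a plain string,
# or a zero-argument callable that returns a list of strings.
InputText = Union[str, Callable[[], List[str]]]


def resolve_input_text(input_text: InputText) -> List[str]:
    """Hypothetical helper: normalize either input form to a list of strings."""
    if callable(input_text):
        # Callable form: evaluate it lazily to obtain the texts.
        texts = input_text()
    else:
        # String form: wrap the single string in a list.
        texts = [input_text]
    # Reject empty results and blank or non-string entries.
    if not texts or not all(isinstance(t, str) and t.strip() for t in texts):
        raise ValueError("input_text must resolve to one or more non-empty strings")
    return list(texts)
```

With this sketch, `resolve_input_text("hello")` and `resolve_input_text(lambda: ["a", "b"])` both yield a list of strings, so downstream code only has to handle one shape.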

The mailing-list discussion related to this effort can be found here: https://lists.apache.org/thread/0d669fmy4hn29h5c0wj0ottdskd77ktp

airflow/providers/openai/hooks/openai.py (outdated, resolved)
airflow/providers/openai/operators/openai.py (outdated, resolved)
airflow/providers/openai/operators/openai.py (outdated, resolved)
airflow/providers/openai/hooks/openai.py (outdated, resolved)
@utkarsharma2 utkarsharma2 marked this pull request as ready for review October 26, 2023 13:50
CONTRIBUTING.rst (outdated, resolved)
INSTALL (outdated, resolved)
Member

@pankajastro pankajastro left a comment

airflow/providers/openai/CHANGELOG.rst (outdated, resolved)
airflow/providers/openai/operators/openai.py (outdated, resolved)
airflow/providers/openai/operators/openai.py (outdated, resolved)
airflow/providers/openai/provider.yaml (outdated, resolved)
tests/providers/openai/hooks/test_openai.py (outdated, resolved)
@pankajastro pankajastro merged commit cca4aa4 into apache:main Nov 7, 2023
71 checks passed
romsharon98 pushed a commit to romsharon98/airflow that referenced this pull request Nov 10, 2023
* Add OpenAI Provider

* Apply suggestions from code review

Co-authored-by: Phani Kumar <94376113+phanikumv@users.noreply.github.com>

* Remove create_completions method from hook

* Change type of input_text param

Since the upstream API accepts a str or a list of tokens, we accept similar inputs from the user.

* Updated min-airflow version to 2.5.0

* Updated the interface and fix docs and static files

* Fix tests

* Fix tests

* Change the version

Because the OpenAI SDK is not production-ready

* Add embedding_kwargs as a param to operator

* Update tests/providers/openai/hooks/test_openai.py

Co-authored-by: Pankaj Singh <98807258+pankajastro@users.noreply.github.com>

* Remove unwanted params in docstring

* Update Changelog

* Add security.rst file

* Update docs/apache-airflow-providers-openai/index.rst

Co-authored-by: Pankaj Singh <98807258+pankajastro@users.noreply.github.com>

* Add host field for connections

* Update docs/apache-airflow-providers-openai/index.rst

Co-authored-by: Pankaj Singh <98807258+pankajastro@users.noreply.github.com>

* Add changelog.rst file to docs

* Change version to 1.0.0

* Resolve conflicts

* Fix tests

* Fixed tests

* Fix test

* Resolve Conflict

---------

Co-authored-by: Pankaj Koti <pankajkoti699@gmail.com>
Co-authored-by: Phani Kumar <94376113+phanikumv@users.noreply.github.com>
Co-authored-by: Pankaj Singh <98807258+pankajastro@users.noreply.github.com>
@ephraimbuddy ephraimbuddy added the changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) label Nov 20, 2023
Labels
area:dev-tools, area:providers, changelog:skip, kind:documentation