perf: Update OpenAI Embedding with latest embedding model #1938

Merged
mhamilton723 merged 11 commits into master from dciborow/openai-nitpick
Jun 6, 2023

Conversation

dciborow
Contributor

@dciborow dciborow commented Apr 25, 2023

Related Issues/PRs

Provide an example for the user of using the latest embedding model if it is available.

What changes are proposed in this pull request?

Briefly describe the changes included in this Pull Request.

How is this patch tested?

  • I have written tests (not required for typo or doc fix) and confirmed the proposed feature/bug-fix/change works.

Does this PR change any dependencies?

  • No. You can skip this section.
  • Yes. Make sure the dependencies are resolved correctly, and list changes here.

Does this PR add a new feature? If so, have you added samples on website?

  • No. You can skip this section.
  • Yes. Make sure you have added samples following the steps below.
  1. Find the corresponding markdown file for your new feature in the website/docs/documentation folder.
    Make sure you choose the correct class (estimators/transformers) and namespace.
  2. Follow the pattern in the markdown file and add another section for your new API, including pyspark, scala (and potentially .NET) samples.
  3. Make sure the DocTable points to the correct API link.
  4. Navigate to the website folder and run yarn run start to make sure the website renders correctly.
  5. Don't forget to add <!--pytest-codeblocks:cont--> before each Python code block to enable auto-tests for Python samples.
  6. Make sure the WebsiteSamplesTests job passes in the pipeline.

@github-actions

Hey @dciborow 👋!
Thank you so much for contributing to our repository 🙌.
Someone from SynapseML Team will be reviewing this pull request soon.

We use semantic commit messages to streamline the release process.
Before your pull request can be merged, you should make sure your first commit and PR title start with a semantic prefix.
This helps us to create release messages and credit you for your hard work!

Examples of commit messages with semantic prefixes:

  • fix: Fix LightGBM crashes with empty partitions
  • feat: Make HTTP on Spark back-offs configurable
  • docs: Update Spark Serving usage
  • build: Add codecov support
  • perf: improve LightGBM memory usage
  • refactor: make python code generation rely on classes
  • style: Remove nulls from CNTKModel
  • test: Add test coverage for CNTKModel
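A rough way to check a title locally before pushing is sketched below. It assumes the accepted prefixes are exactly the ones shown in the examples above; the real check used by the repository may accept others, and `has_semantic_prefix` is a hypothetical helper name, not part of any SynapseML tooling.

```python
import re

# Assumption: prefixes are taken only from the examples listed in this comment.
SEMANTIC_PREFIXES = ("fix", "feat", "docs", "build", "perf", "refactor", "style", "test")

# A title qualifies if it starts with "<prefix>: " followed by some text.
_TITLE_PATTERN = re.compile(r"^(?:%s): \S" % "|".join(SEMANTIC_PREFIXES))


def has_semantic_prefix(title: str) -> bool:
    """Return True if the title starts with one of the known semantic prefixes."""
    return _TITLE_PATTERN.match(title) is not None
```

For instance, the original title of this pull request ("Update CognitiveServices - OpenAI Embedding.ipynb") would fail this check, while the final title ("perf: Update OpenAI Embedding with latest embedding model") would pass.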

To test your commit locally, please follow our guide on building from source.
Check out the developer guide for additional guidance on testing your change.

@dciborow dciborow changed the title from "Update CognitiveServices - OpenAI Embedding.ipynb" to "docs: Update CognitiveServices - OpenAI Embedding.ipynb" Apr 25, 2023
@dciborow dciborow changed the title from "docs: Update CognitiveServices - OpenAI Embedding.ipynb" to "docs: Update CognitiveServices with latest embedding model" Apr 25, 2023
@dciborow dciborow changed the title from "docs: Update CognitiveServices with latest embedding model" to "docs: Update OpenAI Embedding with latest embedding model" Apr 25, 2023
@dciborow dciborow changed the title from "docs: Update OpenAI Embedding with latest embedding model" to "perf: Update OpenAI Embedding with latest embedding model" May 2, 2023
@dciborow dciborow requested a review from mhamilton723 May 2, 2023 20:57
@dciborow
Contributor Author

dciborow commented May 3, 2023

@mhamilton723, I deployed the 'text-embedding-ada-002' endpoint. This should be ready for testing.

@dciborow
Contributor Author

dciborow commented May 3, 2023

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@dciborow dciborow closed this May 3, 2023
@dciborow dciborow reopened this May 3, 2023
@codecov-commenter

codecov-commenter commented May 3, 2023

Codecov Report

Merging #1938 (63aa883) into master (9911bfb) will decrease coverage by 0.04%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #1938      +/-   ##
==========================================
- Coverage   87.01%   86.98%   -0.04%     
==========================================
  Files         305      305              
  Lines       15993    15993              
  Branches      839      839              
==========================================
- Hits        13917    13911       -6     
- Misses       2076     2082       +6     

see 1 file with indirect coverage changes

@dciborow
Contributor Author

dciborow commented May 4, 2023

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).


@github-actions github-actions bot left a comment

Summary by GPT-4

The changes in the code are as follows:

  1. Added a comment with a link to learn more about selecting which embedding model to choose.
  2. Changed the deployment_name_embeddings variable value from "text-search-ada-doc-001" to "text-embedding-ada-002".
  3. Removed the deployment_name_embeddings_query variable and its value "text-search-ada-query-001".
  4. In the embedding_query code block, replaced .setDeploymentName(deployment_name_embeddings_query) with .setDeploymentName(deployment_name_embeddings).

These changes update the deployment name for the embeddings and remove an unnecessary variable, simplifying the code.
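A minimal sketch of the resulting pattern follows, assuming the transformer exposes SynapseML-style fluent setters. Only `deployment_name_embeddings`, the two model names, and `.setDeploymentName(...)` come from the summary above; `configure_embedding`, its parameters, and the other setter calls are illustrative assumptions, not the notebook's actual code.

```python
# One deployment now serves both document and query embeddings
# (previously "text-search-ada-doc-001" and "text-search-ada-query-001").
deployment_name_embeddings = "text-embedding-ada-002"


def configure_embedding(embedding_cls, key, service_name, text_col, output_col):
    """Build an embedding transformer that uses the shared deployment name.

    Hypothetical helper: embedding_cls is assumed to expose fluent setters
    such as setDeploymentName; every setter other than setDeploymentName
    is an assumption about that class.
    """
    return (
        embedding_cls()
        .setSubscriptionKey(key)
        .setDeploymentName(deployment_name_embeddings)
        .setCustomServiceName(service_name)
        .setTextCol(text_col)
        .setErrorCol("error")
        .setOutputCol(output_col)
    )
```

In a Spark session with SynapseML installed, `embedding_cls` would presumably be the `OpenAIEmbedding` transformer, called once for the document column and once for the query column. Because both calls read the same variable, the query embeddings can no longer drift onto a different model than the documents they are compared against.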

Suggestions

No suggestions are needed as the changes in this PR are clear and straightforward.

@dciborow
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mhamilton723
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mhamilton723 mhamilton723 merged commit fc5d699 into master Jun 6, 2023
@mhamilton723 mhamilton723 deleted the dciborow/openai-nitpick branch June 6, 2023 15:17