Fix misc errors in tutorials 1 and 3 #549

Merged
merged 4 commits into from
Mar 31, 2024

@gao-hongnan gao-hongnan commented Mar 31, 2024

Applying Structured Output to RAG applications

While following the tutorial Applying Structured Output to RAG applications, I noticed that some cells are not rendered correctly.

Some of the fixes involve the following (not exhaustive):

  1. One code block sets response_model to s, but s is undefined. Inferring from earlier content, I tentatively replaced s with Iterable[Extraction].

     ```python
     extractions = client.chat.completions.create(
         model="gpt-4-1106-preview",
         stream=True,
         response_model=Iterable[Extraction], # original is `response_model=s`
         messages=[
             {
                 "role": "system",
                 "content": "Your role is to extract chunks from the following and create a set of topics.",
             },
             {"role": "user", "content": text_chunk},
         ],
     )
     ```
    
  2. The notebook imports from a helpers script/module that it does not define:

    ```python
    from helpers import dicts_to_df
    ```
    
    This raises `ModuleNotFoundError: No module named 'helpers'` and halts the program at the next cell. A search led me to https://github.com/wandb/edu/blob/main/llm-structured-extraction/helpers.py, and I added the missing function(s) from there.
    
  3. Small linguistic fix:

     ```text
     One of the big limitations is that often times the query we embed and the text A common method ...
     ```
     
     is changed to
     
     ```text
     Example 1) Improving Extractions
     One of the big limitations is that often times the query we embed and the text we are searching for may not have a direct match, leading to suboptimal results. A common method
     ```
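
For item 2 above, a quick way to confirm whether the import will fail before running the cell is to ask the import system whether the module can be found. This is only a minimal sketch using the standard library; `module_available` is a hypothetical helper name, not something from the tutorial or from instructor.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module called `name` can be located."""
    # find_spec returns None when no top-level module with this name is importable.
    return importlib.util.find_spec(name) is not None


# `helpers` is the module name the tutorial cell imports from.
if module_available("helpers"):
    print("helpers found: `from helpers import dicts_to_df` should import")
else:
    print("helpers not found: define the needed function(s) in the notebook")
```

When the check reports that the module is missing, copying the needed function(s) into the notebook, as this PR does, keeps the tutorial self-contained.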
    

Working with structured outputs

While going through the first tutorial, I noticed that many cells raise errors (by design) to show readers how validation works. It might be cleaner to wrap these in try-except with traceback, so that if we ever use execute=True or similar in the future, the notebook will render instead of halting. An example:

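```python
import traceback

# `data` is the list of dicts defined earlier in the tutorial notebook.
for obj in data:
    name = obj.get("first_name")
    age = obj.get("age")
    try:
        age_next_year = age + 1
        print(f"Next year {name} will be {age_next_year} years old")
    except TypeError:
        # Show the error without stopping the notebook.
        traceback.print_exc()
```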
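
To make the suggestion concrete, here is a self-contained sketch of the same pattern that can be run outside the notebook. The sample records and the `report_ages` function are hypothetical and stand in for the tutorial's own `data`. It uses `traceback.format_exc()`, which returns the same text that `traceback.print_exc()` writes to stderr, so the tracebacks can be collected and shown without stopping execution.

```python
import traceback

# Hypothetical sample records; the tutorial's own `data` may differ.
data = [
    {"first_name": "Ada", "age": 36},
    {"first_name": "Grace", "age": "45"},  # age is a string on purpose
]


def report_ages(records):
    """Print next year's age per record and collect tracebacks instead of raising."""
    errors = []
    for obj in records:
        name = obj.get("first_name")
        age = obj.get("age")
        try:
            age_next_year = age + 1
            print(f"Next year {name} will be {age_next_year} years old")
        except TypeError:
            # Keep the full traceback text so the reader still sees the failure.
            errors.append(traceback.format_exc())
    return errors


errors = report_ages(data)
print(f"{len(errors)} record(s) raised TypeError")
```

Every record is processed even when one of them fails, which is the behaviour wanted if the notebook is ever executed automatically.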


ellipsis-dev bot commented Mar 31, 2024

Skipped PR review on 7b774fe because no changed files had a supported extension. If you think this was in error, please contact us and we'll fix it right away.


Generated with ❤️ by ellipsis.dev

@jxnl jxnl merged commit ce579eb into jxnl:main Mar 31, 2024
3 of 7 checks passed
PrathamSoni pushed a commit to EndexAI/instructor that referenced this pull request Apr 10, 2024