Integrate DSPy into the form generation #68

Merged: 38 commits merged into main from the dspy branch on Jun 27, 2024

Conversation

@snakedye snakedye commented Jun 21, 2024

Use a DSPy signature to structure the prompt.

  • The prompt is reworked to match the documented extracted keys.
  • Logs the content of the document and the form.

Roadblocks:

  • Support for the GPT-4 model in Azure OpenAI
  • Error handling for both Document Intelligence and GPT
  • End-to-end (e2e) testing is waiting on user testing

Close #66
Close #67
Close #69 (since the merge of #70)

* auth.py in ./
* added domain_model.puml of the application
- On /new_label a new label is created in the cache.
- Workflow diagram documenting how the user will proceed.
* analyze is now a POST request
* log the analyzed label
* s/test_document_store/test_label
@snakedye snakedye self-assigned this Jun 21, 2024
snakedye and others added 6 commits June 21, 2024 16:09
@snakedye snakedye marked this pull request as ready for review June 25, 2024 19:54
@snakedye snakedye requested a review from a team as a code owner June 25, 2024 19:54

@Endlessflow Endlessflow left a comment

As far as I can tell this looks good.
A few ideas crossed my mind when I was looking at the sanity checks. Right now we are doing the bare minimum, but since the fields we are expecting are almost all strings, it would be interesting to see if it is possible (and worth it) to store the numerical values back in numerical format, especially when the units are implicit.
I also noticed that some fields expect values in certain formats that we might be able to verify with simple pattern matching (e.g. a field that expects a percentage needs to have a % sign). However, I don't know whether a failed pattern-matching check is grounds for rejecting a generation, or whether it would be better to flag such inconsistencies in the front end for human review and let the reviewers decide.
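
To make the two ideas above concrete, here is a minimal sketch of what such a check could look like. It is not taken from backend/form.py: the field names (`concentration`, `purity`), the `FIELD_PATTERNS` table, and the `check_form` helper are all hypothetical. It assumes the form arrives as a flat dict of strings, converts percentage values back to floats, and flags mismatches for review instead of rejecting the generation.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Matches values such as "12.5 %" or "40%" and captures the number.
PERCENT = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*%\s*$")

# Hypothetical mapping of form keys to the format each one should follow.
FIELD_PATTERNS = {"concentration": PERCENT, "purity": PERCENT}


@dataclass
class CheckResult:
    values: dict  # numeric where a pattern matched, raw string otherwise
    flags: list   # keys whose value did not match the expected format


def check_form(form: dict[str, str]) -> CheckResult:
    """Convert pattern-checked fields to floats and flag the ones that fail."""
    values: dict = {}
    flags: list = []
    for key, raw in form.items():
        pattern = FIELD_PATTERNS.get(key)
        if pattern is None:
            values[key] = raw  # no format expectation for this field
            continue
        match = pattern.match(raw)
        if match:
            values[key] = float(match.group(1))
        else:
            values[key] = raw   # keep the original text for the reviewer
            flags.append(key)   # flag instead of rejecting the generation
    return CheckResult(values, flags)
```

With this shape, the front end could highlight every key listed in `flags` and leave the accept-or-reject decision to a human reviewer, while the fields that did match are already stored as numbers.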

backend/form.py (resolved)
@Endlessflow Endlessflow merged commit 844b7e2 into main Jun 27, 2024
5 checks passed
@Endlessflow Endlessflow deleted the dspy branch June 27, 2024 13:10