Add default example if none are provided to COT NER/SpanCat tasks #270

Merged
rmitsch merged 9 commits into kab/cot-ner from kab/cot-ner-default-example
Aug 24, 2023

Conversation

kabirkhan
Contributor

Description

To show the LLM the output format we expect, fall back to a default example if no examples are provided to the SpanTask.
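
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `SpanExample`, `DEFAULT_EXAMPLE`, and `resolve_prompt_examples` are hypothetical names, and the contents of the default example are invented for demonstration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple


@dataclass
class SpanExample:
    """Hypothetical stand-in for a few-shot prompt example (not the spacy-llm class)."""

    text: str
    # Each span is a (span text, label) pair.
    spans: List[Tuple[str, str]] = field(default_factory=list)


# Assumed default; the example actually shipped with the task may differ.
DEFAULT_EXAMPLE = SpanExample(
    text="Jack and Jill went up the hill.",
    spans=[("Jack", "PERSON"), ("Jill", "PERSON"), ("hill", "LOCATION")],
)


def resolve_prompt_examples(
    prompt_examples: Optional[Sequence[SpanExample]],
) -> List[SpanExample]:
    """Return the user's examples, or a single default example if none were given."""
    if prompt_examples:
        return list(prompt_examples)
    # None or an empty sequence: still show the LLM one example of the expected format.
    return [DEFAULT_EXAMPLE]
```

With this shape, a zero-shot configuration (no examples) still renders one formatted example into the prompt, while user-supplied examples are passed through unchanged.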

Corresponding documentation PR

Types of change

enhancement

Checklist

  • I confirm that I have the right to submit this contribution under the project's MIT license.
  • I ran all tests in tests and usage_examples/tests, and all new and existing tests passed. This includes
    • all external tests (i.e. pytest ran with --external)
    • all tests requiring a GPU (i.e. pytest ran with --gpu)
  • My changes don't require a change to the documentation, or if they do, I've added all required information.

@rmitsch rmitsch added the enhancement (Improvement of existing feature) and feat/task (Feature: tasks) labels Aug 24, 2023
Collaborator

@rmitsch rmitsch left a comment

Thanks!

@rmitsch rmitsch marked this pull request as ready for review August 24, 2023 06:44
Member

@svlandeg svlandeg left a comment

LGTM!

@rmitsch rmitsch merged commit 61a6ab7 into kab/cot-ner Aug 24, 2023
11 checks passed
@svlandeg svlandeg deleted the kab/cot-ner-default-example branch August 24, 2023 12:31
rmitsch added a commit that referenced this pull request Aug 25, 2023
* initial POC for Chain of Thought NER task

* ruff fix

* update template

* consolidate approach to work with main SpanTask

* fix tests around label consistency checks

* fix edge cases

* update label consistency checks

* move label consistency checks

* handle labels in span.py

* cleanup older NER

* fixes

* cleanup

* update NER template with label_definitions + initial description, fix parsing of SpanReason

* properly parametrize response parsing for SpanReason test

* start to parametrize NER tests properly with new v3 template

* fix docstring

* rm single_match since it's always true now

* rm single_match since it's always true now

* fix typing of description and default properly for SpanCatTask

* fix test

* fix NER tests

* fix more ner tests + add initial test for SpanReason.from_str

* fix ner to_disk test

* enable adding ner prompt examples from initialize and fix ner_init test

* test fixes

* add yaml/jsonl version of ner examples. Fix inconsistent labels tests

* use yaml/jsonl versions of ner examples

* actually check scoring with real LLM call

* rename format_response to extract_span_reasons

* move Self to compat types

* fix test for serde

* Self only in 3.10+

* Self only in 3.11+ actually

* ner test fixes

* convert spancat to new span task format

* add better doc for SpanReason.to_str

* fixing tests for spancat

* adjust span matching by adding an  setting

* support conditional allow_overlap like standard spancat

* remove dict | operator that only works in python3.9 +

* disable test for now so CI passes

* revert spanreason start_char

* fix spancat template rendering for allow_overlap

* clean up tests for init with spacy examples

* fix spancat test?

* make spancat scoring use external model, not weird dummy data

* run case sensitive matching then fallback to case insensitive if the user configured it that way

* rm prev_span reference in parsing

* separate span parsing for a single doc into its own function

* fix typing on the regression test

* add description field to cfg_keys so it gets serialized

* add old spancat/ner versions to tasks.legacy module

* add deprecation warnings + test deprecation warnings

* update usage examples

* update examples and readme

* fix usage_examples

* rename warning to LLMW001, fix usage_example + readme tests

* remove separate case sensitive match step before doing case insensitive as a fallback

* fix regression test to have 3 ents

* fix incremental parsing

* rm extra docstring stuff

* rm extra test

* consolidate new spans template and ensure valid labels appear in the prompt

* make prompt_examples required since it's required in confection factory func

* fix template rendering tests with new optional description and default content

* Remove extra deprecation warning

* Update usage_examples/ner_v3_openai/README.md

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Fix 3.6 Protocol import.

* Update ignored warnings.

* Fix filterwarnings.

* Update filterwarnings.

* Renamed examples to prompt_examples.

* Add default example if none are provided to COT NER/SpanCat tasks (#270)

* use a default example for zero-shot NER/spancat COT tasks if none are provided

* add test for no examples spancat COT

* Fix tests.

* Update filterwarnings.

* Update RELExample factory.

* Comment validator.

* Attempt to fix pydantic macOS error.

* Attempt to fix pydantic macOS error.

* Update filterwarnings.

---------

Co-authored-by: Raphael Mitsch <r.mitsch@outlook.com>

* Test Pydantic Mac OS Py 3.8 issue.

* Incorporate feedback. Readd Pydantic REL example workaround.

* Readd NER Dolly usage example, removed TextCat Dolly one.

* Update NER Dolly usage example to use NER.v3.

* Readd NER Dolly test. Revert to NER.v2. Refactor span extraction for Span tasks.

* Fix span reason extraction.

* Add working Paris-Paris-Paris example.

* Uncomment example for NER prediction test.

* Fix NER prediction test.

* remove errors class entirely

* Update .github/workflows/test.yml

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update spacy_llm/tests/tasks/test_ner.py

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Remove overlap part in NER template.

* Remove overlap path in NER and SpanCat templates.

* Update spacy_llm/tasks/spancat/registry.py

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update spacy_llm/tasks/ner/registry.py

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Changed SpanCat prompt intro.

* Add docstring info for description.

---------

Co-authored-by: vinit <vinit.ravishankar@gmail.com>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
Co-authored-by: Raphael Mitsch <r.mitsch@outlook.com>
Co-authored-by: svlandeg <svlandeg@github.com>
Labels
enhancement (Improvement of existing feature), feat/task (Feature: tasks)

3 participants