
Add 🤗 hub support #13

Merged
monk1337 merged 8 commits into promptslab:main from add-hf-hub-support on Feb 8, 2023

Conversation

@Wauplin (Contributor) commented Jan 31, 2023

This is very preliminary work to integrate models from the Hugging Face Hub (see issue #5).

I tested it only with the google/flan-t5-xl model. It seems to perform well on some classification tasks but definitely needs more evaluation. The good part is that it should be easy to compare different models hosted on the Hub simply by changing the model id.

I'm still not 100% sure about the API design, and it needs refinement (docs, examples, tests?). Feedback is welcome :)

EDIT: I added a notebook guide explaining how to use the HubModel class. I didn't want to add too much to the README, so I placed it separately. Happy to move it if you prefer it somewhere else.

# Import path assumed here; adjust to match the package layout on this branch.
from promptify import HubModel, Prompter

# prompt_examples: a list of few-shot examples, defined elsewhere.
model = HubModel(model_id_or_url="google/flan-t5-xl")
prompter = Prompter(model)

output = prompter.fit(
    "binary_classification.jinja",
    label_0="positive",
    label_1="negative",
    examples=prompt_examples,
    text_input="I've read many post on here saying Israel gives training to US law enforcement .",
)
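
As an illustration of the "swap the model id" workflow, the same prompt can be sent to a second Hub model. The id below (google/flan-t5-large) is only an example and was not evaluated in this PR:

# Illustrative sketch: only the model id changes between runs.
other_model = HubModel(model_id_or_url="google/flan-t5-large")
other_prompter = Prompter(other_model)
other_output = other_prompter.fit(
    "binary_classification.jinja",
    label_0="positive",
    label_1="negative",
    examples=prompt_examples,
    text_input="I've read many post on here saying Israel gives training to US law enforcement .",
)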

Overall I think the prompts are currently optimized for OpenAI models. That's not really a problem, since models on the Hub are very diverse anyway. In any case, there are no open-source competitors to models like ChatGPT (at least for now).

Comment on lines -44 to -49
{
"text": "i have been with petronas for years i feel that petronas has performed well and made a huge profit",
"labels": "negative",
"score": "",
"complexity": ""
},
@Wauplin (Contributor, Author) commented Feb 2, 2023

That was a duplicate of the previous example.

@Wauplin marked this pull request as ready for review February 2, 2023 16:54
@Wauplin changed the title from "[WIP] Add 🤗 hub support" to "Add 🤗 hub support" Feb 2, 2023
@jpoles1 commented Feb 5, 2023

Is it possible to use this for NER or "token classification" tasks? Or is it limited to text-generation tasks?

@Wauplin (Contributor, Author) commented Feb 5, 2023

@jpoles1 This PR is only meant to let everyone use the promptify library with models shared on the Hugging Face Hub.

As for your question, there is indeed a template to generate a prompt for an NER task (see ner.jinja). In general, the whole point of promptify is to use (L)LMs to perform any type of NLP task with zero-shot/few-shot learning. So the answer to "is it possible to perform token classification with this lib?" is yes, as long as the underlying language model you use is capable of it, which is not always the case.
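
For illustration, a zero-shot NER call through the same HubModel API could look roughly like the sketch below; the template variable names (e.g. domain) and the suitability of this model id for NER are assumptions, not something verified in this PR:

# Sketch only: reuse the HubModel/Prompter API from the PR description with the NER template.
model = HubModel(model_id_or_url="google/flan-t5-xl")
prompter = Prompter(model)
ner_output = prompter.fit(
    "ner.jinja",
    domain="medical",  # assumed template variable
    text_input="The patient was prescribed 500 mg of paracetamol for the fever.",
)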

@Wauplin (Contributor, Author) commented Feb 5, 2023

Hope this answers your question a bit. For more general questions about promptify, I would advise you to open an issue on GitHub or join the Discord.

@jpoles1 commented Feb 6, 2023

Ah, yep, my question came from a fundamental misunderstanding of what's going on under the hood. Thanks for clarifying @Wauplin.

@eren23 (Collaborator) commented Feb 6, 2023

@Wauplin Hey, this is an emergency: we are trying to use promptify to extract addresses from Turkish earthquake tweets. OpenAI works well, but we need an alternative just in case. Can you join a Discord call to help us work our way around the HF implementation?

@Wauplin (Contributor, Author) commented Feb 6, 2023

@eren23 Replied to your message in DM.

For anyone wanting to test promptify with HF models, you can install the package from my branch:

pip install git+https://github.com/Wauplin/Promptify@add-hf-hub-support

But please be aware that this is still very new and not tested on real cases (i.e. it's not clear which models hosted on the Hub will perform well, on which tasks, on which languages, with which prompts, ...). The PR only enables the possibility of using Hugging Face models; it does not guarantee the effectiveness of the method.
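
A quick sanity check after installing from the branch could be an import one-liner; the top-level import path is an assumption based on the class name used in this PR, not a documented API:

# Assumed import path; adjust if HubModel lives in a submodule on this branch.
python -c "from promptify import HubModel; print(HubModel)"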

@monk1337 merged commit f99c503 into promptslab:main Feb 8, 2023
@monk1337 (Contributor) commented Feb 8, 2023

Thank you @Wauplin for your great contribution; merging it now.

@Wauplin deleted the add-hf-hub-support branch February 8, 2023 13:32
@monk1337 added the "models" label (issue related to models) Mar 26, 2023