
Add formatting_fn, fallback mechanism, Prompt class, and some more #33

Merged
merged 11 commits into dev on Oct 31, 2023

Conversation


@alvarobartt commented Oct 31, 2023

Description

This PR tackles both #30 and #26, as well as some other issues that have been identified while developing those.

A `Prompt` (a `pydantic.BaseModel`) has been defined to track the `system_prompt` and the `formatted_prompt` internally, as well as a method named `format_as` that adapts those to existing known formats, i.e. OpenAI and Llama2 for the moment (to be extended in an upcoming PR).
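A minimal sketch of what such a class could look like; in the actual PR it is a `pydantic.BaseModel`, while a plain dataclass is used here only to keep the sketch dependency-free, and the exact message/marker layout per format is an assumption:

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Union


@dataclass
class Prompt:
    # In the actual PR this is a `pydantic.BaseModel`; a dataclass is used
    # here only to keep the sketch dependency-free.
    system_prompt: str
    formatted_prompt: str

    def format_as(self, format: str) -> Union[str, List[Dict[str, Any]]]:
        """Adapt the prompt to a known format (illustrative layouts)."""
        if format == "openai":
            # OpenAI chat-completion style: a list of role/content messages
            return [
                {"role": "system", "content": self.system_prompt},
                {"role": "user", "content": self.formatted_prompt},
            ]
        if format == "llama2":
            # Llama 2 instruction style with [INST] / <<SYS>> markers
            return (
                f"<s>[INST] <<SYS>>\n{self.system_prompt}\n<</SYS>>\n\n"
                f"{self.formatted_prompt} [/INST]"
            )
        raise ValueError(f"Unsupported format: {format!r}")
```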

Besides that, the `_parse_output` method has been removed in favour of `parse_output`, which implements a fallback mechanism so that the raw responses/generations are also returned whenever `parse_output` fails.
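The fallback described above could be sketched as follows; the `safe_parse` helper and the output-dict keys are hypothetical names, not the actual implementation:

```python
def safe_parse(task, raw_response: str) -> dict:
    """Call the task's `parse_output`, but keep the raw response either way."""
    output = {"raw_response": raw_response, "parsed_output": None}
    try:
        output["parsed_output"] = task.parse_output(raw_response)
    except Exception:
        # Fallback: parsing failed, but the raw response is still returned,
        # so no generation is ever lost.
        pass
    return output
```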

On top of that, since `generate_prompt` will now return a `Prompt` instance by default, instead of a prompt already formatted for a given model, a `formatting_fn` can optionally be defined to format that `Prompt` the way the model expects; otherwise, either the current format or the default `format_as` will be used.
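In other words, the selection logic could look roughly like this; `resolve_prompt` is a hypothetical helper for illustration, not the actual pipeline code:

```python
from typing import Any, Callable, Optional


def resolve_prompt(
    prompt: Any,
    formatting_fn: Optional[Callable[[Any], Any]] = None,
    default_format: str = "openai",
) -> Any:
    """Apply a user-provided `formatting_fn` if given, else fall back to
    the built-in `format_as` conversion of the `Prompt`."""
    if formatting_fn is not None:
        # The user-provided function receives the Prompt and returns
        # whatever the model expects.
        return formatting_fn(prompt)
    return prompt.format_as(default_format)
```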

A fully working script has been uploaded to examples/pipeline-to-argilla.py.

@alvarobartt marked this pull request as ready for review October 31, 2023 14:16
@alvarobartt merged commit 9b1cf07 into dev Oct 31, 2023
@alvarobartt deleted the formatting-fn-and-raw-response-fallback branch October 31, 2023 15:03
gabrielmbmb added a commit that referenced this pull request Nov 23, 2023
* Add `prompts` with `Llama2Prompt`

* Add `HuggingFaceLLM` and `Llama2CppLLM`

* Fix some type-hints

* Add `_argilla_installed` to check if `argilla` is installed

* Add `chat_format` and `rank_format`

* Add `LlamaCppLLM.{as_generator,as_ranker}`

Co-authored-by: Gabriel Martin <gabriel@argilla.io>

* Use `torch>=2.0`

* Add package configuration files

* Add `LLM` base class

* Add `LlamaCppLLM` configuration params

* Rename to `dataset.py`

* Refactor `Prompt` in favour of `PromptTemplate`

Still under active development

Co-authored-by: Gabriel Martín <gabriel@argilla.io>

* Refactor `LLM` base class

Still under active development

Co-authored-by: Gabriel Martín <gabriel@argilla.io>

* Migrate from `dataclass` to `pydantic.BaseModel`

* Move `ranking.py` to `response_ranking.py`

* Update and fix `GPT4ResponseRankingPrompt`

* Rename `ranking.jinja2` to `response-ranking.jinja2`

* Clean imports in `__init__`

* Fix `generate` in `OpenAILLM`

* Add `LlamaCppLLM` with `PromptTemplate`

* Refactor `LLM` base class

Still under active development

Co-authored-by: Gabriel Martín <gabriel@argilla.io>

* Add `num_generations` argument

* Align `OpenAILLM` with `LlamaCppLLM`

* Rename `prompts` to `inputs`

* Move `jinja2` files to `templates/*.jinja2`

* Refactor `ResponseRankingPromptTemplate` into `OpenAIResponseRanking`

* Remove `rating_model.py`

* Add basic generation pipeline

Co-authored-by: alvarobartt <alvaro@argilla.io>

* Dataset generation from prompt templates

Co-authored-by: alvarobartt <alvaro@argilla.io>

* Add `example/use-custom-template.py`

* Add `openai` dependencies

* Add handling OpenAI API exceptions

* Add `TransformersLLM`

* Add `InferenceEndpointsLLM` and `examples/inference-endpoints-llm.py` (#11)

* Move `TransformersLLM` to `huggingface.py` and add `InferenceEndpointsLLM`

* Fix `InferenceEndpointsLLM.generate`

* Add `examples/inference-endpoints-llm.py`

* Add handling futures (#10)

* Add handling futures for `labelling_llm`

* Update combining generation results at pipeline level

* Add prompt template for text generation with OpenAI

* Update pipeline to be able to process futures from `generation_llm`

* Refactor `LLM`, adapt `_generate` methods, and `OpenAILLM` validation (#13)

* Add `concurrent.futures` and `tenacity` to `InferenceEndpointsLLM`

* Move `concurrent.futures` to `LLM`

* Add `max_new_tokens`, `temperature` and `num_threads` to `TransformersLLM` and `InferenceEndpointsLLM`

* Adapt `_generate` method in `LLM` subclasses

* Add `OpenAILLM` model validation via `available_models` property

* Fix `openai.api_key` to be set before `available_models`

* Update `examples/use-custom-template.py` output

* Update `Pipeline` to make LLMs arguments optional (#12)

* Update `Pipeline` to make LLMs arguments optional

* Add checking that at least one LLM has been provided

* Remove unused import

* Add handling parse output step exceptions (#14)

* Add `template` cached property (#15)

* Add function to create func for progress bars (#17)

* Add missing dependencies (#18)

* Add `argilla` integration on `CustomDataset` (#16)

* Move `PreferenceDataset` to `legacy_dataset.py`

* Add `CustomDataset` extending from `datasets.Dataset` (WIP)

* Add `_remap_dataset` in `Pipeline` to use `CustomDataset`

* Use `rating` instead of `score`

* Set `num_generations=1` in `labelling_llm.generate`

As well as adding some context on why it is needed; it is included as a safety mechanism, since relying on the default value for `num_generations` in `generate` may not be the most suitable

* Add `argilla_output_args` in `generate` for `to_argilla` (WIP)

* Add `to_argilla_{fields,questions,record}` methods in `OpenAIResponseRanking` (WIP)

* Fix `OpenAITextGenerationPromptTemplate.parse_output` return type-hint

* Define custom `questions` for `OpenAIResponseRanking`

* Add more readable `title` for `fields` and `questions`

* Add `ArgillaTemplate` with `abstractmethod`s

* Fix `examples/*.py` to implement `_parse_output` instead

* Rename `Ranking` (and similar) to `Rating`

* Fix `OpenAIResponseRating`

* `CustomDataset` to expect `prompt_template` for `to_argilla`

* Use `List` instead of `list` on type-hints

* Fix `cached_property` cannot be pickled by thread lock

The exception raised when using `cached_property` and trying to `from rlxf.prompts.base import PromptTemplate` was `TypeError: cannot pickle '_thread.RLock' object`

* Fix typo in `labelling_llm` in `Pipeline`

* Fix `response` alignment starting index and strip `str` values

* Rename `title` for both `fields` and `questions`

It is now `Response N`, as the `OpenAI` title was not giving the human labeller any useful information, and `Response N` is clearer

* Delete `legacy_dataset.py` with `PreferenceDataset`

* Refactor `rlxf` to `ultralabel` (#19)

Co-authored-by: Daniel Vila <daniel@argilla.io>
Co-authored-by: Gabriel Martin <gabriel@argilla.io>

* Add `examples/use-custom-template-for-{dpo,text-classification}.py` (#21)

* Add `pipeline` function (#20)

* Fix `to_argilla` return type-hint

* Add `pipeline` function and `Generic[TypeVar]` to `Pipeline`

* Extend `ArgillaTemplate` to include `*args` and `**kwargs`

* Add `group_ratings_as_ranking` arg in `OpenAIResponseRating`

* Add `PreferenceDataset` including `group_ratings_as_ranking` arg

* Update and set defaults in `OpenAIResponseRating`

Co-authored-by: Daniel Vila <daniel@argilla.io>

* Fix `dataset_cls` type-hint

Co-authored-by: Gabriel Martin <gabriel@argilla.io>

---------

Co-authored-by: Daniel Vila <daniel@argilla.io>
Co-authored-by: Gabriel Martin <gabriel@argilla.io>

* Align default `temperature` and check `use_default_system_prompt` in `TransformersLLM` (#27)

* Align default `temperature` to 0.7

* Add `warning` if tokenizer has `use_default_system_prompt=True`

* Rename `PromptTemplate` to `Task`, and remove `ArgillaTemplate` (#28)

* Rename `prompts` to `tasks`

* Rename `PromptTemplate` to `Task`

* Remove `ArgillaTemplate` in favour of `Argilla` included in `Task`

* Move `llama.py` and `openai_.py` under `tasks/text_generation`

* Add missing `task_description` in `Task` class

* Add `ChatCompletion` in `tasks/utils.py`

* Add `ultrafeedback.jinja2` and `tasks/preference/ultrafeedback.py`

* Add empty `tasks/critique` module

* Add `MultiRatingTask` detached from `OpenAILLM` (#29)

* Move `llama.py` and `openai_.py` under `tasks/text_generation`

* Add missing `task_description` in `Task` class

* Add `ChatCompletion` in `tasks/utils.py`

* Add `ultrafeedback.jinja2` and `tasks/preference/ultrafeedback.py`

* Add empty `tasks/critique` module

* Move `templates/*.jinja2` to `_templates/*.jinja2`

* Add missing `system_prompt` in `MultiRatingTask`

* Fix typo in `to_argilla_questions` (`Whats's` to `What's`)

* Add `examples/pipeline-to-argilla.py`

* Fix `ultrafeedback.jinja2` response formatting

Closes #31

* Add `formatting_fn`, fallback mechanism, `Prompt` class, and some more (#33)

* Remove unused template `gpt-text-generation.jinja2`

* Add `Prompt` with `{system,formatted}_prompt` and `format_as` method

* Remove `_parse_output` in favour of `parse_output` and add `Prompt` to type-hint

* `OpenAITextGenerationTask.generate_prompt` to use `Prompt.format_as("openai")`

* Rename `_parse_output` to `parse_output`

* Add `formatting_fn` and fallback mechanism to return raw responses

* Update `examples/pipeline-to-argilla.py`

* Return `Prompt` in `generate_prompt` method in `MultiRatingTask`

* Add `raw_{generation,labelling}_response` column to `datasets.Dataset`

* Fix `raw_labelling_response` column addition to `datasets.Dataset`

* Add `logging` to `tenacity.retry` in `OpenAILLM`

* Fix `ratings` and `ratings_description` in `ultrafeedback.jinja2` (#37)

* Fix task description and remove ratings_description to simplify usage (#40)

* format

* Add linebreak

* Rename `generation_llm->generator` and `labelling_llm->labeller`, and add `subtask` arg to `pipeline` (#41)

* Add `__subtasks__` in `MultiRatingTask`

* Rename `generation_llm->generator` and `labelling_llm->labeller`

* Add `subtask` arg in `pipeline` and handle `labeller` if provided

* Add `SelfInstructTask` (#42)

* Add `JudgeLM`, fix label-formatting in `pipeline`, rename to `UltraFeedbackTask`, and more (#48)

* Add `logging` to `tenacity.retry` in `InferenceEndpointsLLM`

* Delete unused template `gpt4-response-rating.jinja2`

Removed in favour of `ultrafeedback.jinja2`

* Add `judgelm.jinja2` template

* Fix typo `prompt template` -> `task`

* Remove not required `argilla_{fields,questions}_typedargs` variables

* Add `JudgeLMTask`

* Rename `MultiRatingTask->UltraFeedbackTask` and minor improvements

* Fix label-formatting in `Pipeline.generate`

* Update `examples/*.py` and remove outdated ones

* Add some TODOs for upcoming iterations

Co-authored-by: Gabriel Martin <gabriel@argilla.io>

---------

Co-authored-by: Gabriel Martin <gabriel@argilla.io>

* Update `LICENSE` (#49)

* Add `LICENSE` headers to Python files

* Update `LICENSE` to Argilla, Inc.

* Renaming (#50)

* formatted

* format

* Fixes local var undeclared (#60)

* formatted

* format

* Fixes local var error when no labeler

* Adds Zephyr template (#61)

* Adds Zephyr turn0 template

* Define `LLMOutput` type for `generate` and store `prompts` in `datasets.Dataset` (#51)

* Add `LLMOutput` as `TypedDict`

* Use `LLMOutput` as return type for `LLM._generate` and subclasses

* Fix `to_dict_recursive` call separated

* Fix `raw_generation_response` removal before `labelling`

* Add `LLMOutput` and improve `batch_{generations,labels}` processing

* Remove non-required empty dict when processing labels

* Rename `raw->raw_output` and `parsed->parsed_output` in `LLMOutput`

* Add `prompt_used` in `LLMOutput`

* Add `{generation,labelling}_prompt` columns

* Add `_reset_dataset` to avoid dict conversion/loading

* Add `warnings.warn` if `num_generations<2` for `generator` when `labeller` is not None

* Add missing `Future` return type-hint in `LLM.generate`

Co-authored-by: Gabriel Martin <gabriel@argilla.io>

* Fix wrong return type-hint in `_get_batch_generations`

---------

Co-authored-by: Gabriel Martin <gabriel@argilla.io>

* Fix `LLMOutput.parsed_output` to be `None` if `parse_output` fails (#63)

* Make `LLMOutput.parsed_output` optional as it can be `None`

* Set `LLMOutput.parsed_output` to `None` if `parse_output` fails

* TODO: Add `do_sample=False` if `temperature=0.0` for greedy decoding

This won't be tackled here, as it should be addressed in a separate PR as part of #39

* Add basic logger (#56)

* Add basic logger

* Add `DISTILABEL_LOG_LEVEL` environment variable

* Add license header

Co-authored-by: Alvaro Bartolome <alvaro@argilla.io>

* Move aux function to `Pipeline` class

Co-authored-by: alvarobartt <alvaro@argilla.io>

---------

Co-authored-by: Alvaro Bartolome <alvaro@argilla.io>

* Move inference endpoints (#65)

* Restructure Hugging Face LLMs

* Fix terminal cursor disappearing after executing pipeline

* Fix parsing rationale scores (#67)

* Add more generation parameters for each LLM (#68)

* Add `__generation_attrs` to provide only `not None` kwargs (#71)

* Add `enable_checkpoint` in `Pipeline.generate` (#69)

* Fix `LLM.generate` return type-hint

* Remove duplicated `_include_generator_outputs_as_inputs`

* Improve `_validate_dataset` exception messages

* Remove `_reset_dataset` in favour of `_reset_and_remap_dataset`

* Add `_overlapping_columns_with_dataset` method and `force` arg in `generate`

* Add `start=1` in `enumerate` over `dataset.iter()`

* Add `checkpointing` arg in `Pipeline.generate`

* Add missing `parsed_output is not None` check

Closes #57

* Curate checkpointing strategy (WIP)

* Add `task` to backup dataset when `checkpointing=True` (WIP)

* Use `*args` and `**kwargs` in `CustomDataset` and `Argilla`

* Add `_overlapping_columns_with_dataset` as part of `_validate_dataset`

* Remove `_reset_and_remap_dataset` method

* Add `_build_dataset` method and rename `checkpointing->enable_checkpoints`

* Ensure `enable_checkpoint` is able to `_build_dataset`

* Add `examples/zephyr-judgelm.py`

Co-authored-by: Daniel Vila <daniel@argilla.io>

* Ensure all required columns exist on `_build_dataset`

* Fix `examples/zephyr-judgelm.py` import

* Fix typo `frequence_penalty->frequency_penalty`

---------

Co-authored-by: Daniel Vila <daniel@argilla.io>

* Add retry on `ConnectionError` (#76)

* Add `chatML` format (#72)

* Add `chatML` format in `Prompt.format_as`

See https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/chatgpt?tabs=python&pivots=programming-language-chat-ml#working-with-chat-markup-language-chatml

* Rename `utils.py` to `prompt.py`

* Remove `pydantic` dependency (#79)

* Fix `examples/*.py` (#73)

* Add `vLLM` (#83)

* Add `vLLM` class

* Update `vLLM` implementation for batching

* Add `_generate_prompts` method

* Update `LlamaCppLLM` for batching

* Refactor pipeline and LLMs for batching

* Update to `openai>=1.0.0` (#88)

* Add `prompt_format` and `prompt_formatting_fn` args in `LLM` and update `_generate_prompts` method (#85)

* Add `TextGenerationTask`

* Use `TextGenerationTask` inheritance and split `SelfInstructTask`

* Remove unused `distilabel/tasks/critique` module

* Remove unused templates and rely on `format_as` in `Llama2TextGenerationTask`

* Add `zephyr` in `format_as` and include `chatml` in `Literal` type-hint

* Add `prompt_format` and `prompt_formatting_fn`, and use in `_generate`

* Add `SupportedFormat` type with `Prompt.format_as` values

* Fix `JudgeLMTask` to join `rationale` using line-breaks

* Add `default` within `format_as` instead

Co-authored-by: Gabriel Martin <gabriel@argilla.io>

* Fix import under `TYPE_CHECKING` from `distilable` to `distilabel`

---------

Co-authored-by: Gabriel Martin <gabriel@argilla.io>

* Add `model_name` in `LLM`s and as column in `datasets.Dataset` (#84)

* Add `model_name` property to `LLM` subclasses

* Use `get_inference_endpoint` instead of `InferenceClient`

* Add `model_name` to `vLLM`

* Update and fix `examples/*.py`

* Remove unused `requirements.txt` file

* Add `huggingface_hub >= 1.19.0` dependency

* Add `model_name` to `LLMOutput`

* Add `generation_model` and `labelling_model` columns in `Pipeline.generate`

* Revert merge code-deletion of `LLM._generate_prompts`

Co-authored-by: Gabriel Martin <gabriel@argilla.io>

* Add note on why `JudgeLM` and `ULTRAFEEDBACK` ratings are `int`

* Add `wait` with `timeout=30` to `InferenceEndpoint`

* Fix `_build_dataset` to build dataset from `dataset.to_dict()`

For context, using `dataset.data` will return the `ArrowTable`, but if any pre-processing is applied beforehand, `data` won't reflect it and will instead return the data stored locally in the Arrow files, leading to mismatches between the data used within `Pipeline.generate` and the actual data in `Dataset.data`

---------

Co-authored-by: Gabriel Martin <gabriel@argilla.io>

* Use `flatten_indices` over `to_dict` + `from_dict`

See huggingface/datasets#6413

* Add `release` workflow (#70)

* Update to use `ruff format` instead of `black` (faster)

* Add `release` workflow

* Adds `principles` to `TextGenerationTask` (#87)

* Add principles for text generation

* Move `principles` logic to base `TextGenerationTask`

* Update if condition

* Change name to `UltraFeedbackPrinciples`

---------

Co-authored-by: gabrielmbmb <gmartinbdev@gmail.com>

* Update labeller exception handling (#90)

* Update exception handling for `future.result`

* Add `model_name`

Co-authored-by: Alvaro Bartolome <alvaro@argilla.io>

* Add adding missing labels back

Co-authored-by: alvarobartt <alvaro@argilla.io>

* Remove TODO

Co-authored-by: Alvaro Bartolome <alvaro@argilla.io>

---------

Co-authored-by: Alvaro Bartolome <alvaro@argilla.io>

* Improve progress bar (#92)

* Add `use_progress_bar` decorator

* Add `Pipeline` with default generic parameter

* Update type hint

* Fix advance value for progress bar

* Adds basic readme (#93)

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Split Hugging Face LLMs dependencies (#94)

* Split HF dependencies in two extras

* Add `torch` dependency to `hf-transformers`

* Bump version to `0.1.0rc0`

* Add `rich` dependency

* Bump version to `0.1.0rc1`

* Update pyproject.toml (#96)

* Add `UltraJudgeTask` (#99)

* Add `UltraJudgeTask`

* Update regexes to capture decimal numbers

* Fix `output_args_names`

Co-authored-by: Alvaro Bartolome <alvaro@argilla.io>

---------

Co-authored-by: Alvaro Bartolome <alvaro@argilla.io>

* Fix `to_argilla` for JudgeLM when output args are `None` (#97)

* Fix `to_argilla` for JudgeLM when output args are `None`

* Fix `ultrafeedback` rating suggestion with value==0

* Add warning when `rating==0`

Co-authored-by: alvarobartt <alvaro@argilla.io>

---------

Co-authored-by: alvarobartt <alvaro@argilla.io>

* Fix `pipeline` task kwargs handling & ignore extra columns in `dataset` (#101)

* Allow extra columns within provided `dataset` to `Pipeline.generate`

* Fix `pipeline` handling of `UltraFeedbackTask` kwargs

* Generalize Preference Task and unify to_argilla methods (#100)

* Adds metadata to_argilla for judgeLM

* several fixes and example

* Generalize and refactor preference task to argilla

* More restructuring

* remove rankings

* refactor

* Moving methods to argilla, add test of three preference tasks

* fix issue when ratings are none

* typing

* move check arg exists to base

* move common args to pref base

* remove unused import

* renaming create_fields

* remove silly method

* Remove unused `=` file

Probably came from a `pip install something>=` not properly closed

* Fix and rename `basic-judgelm->label-dataset-using-judgelm`

---------

Co-authored-by: Alvaro Bartolome <alvaro@argilla.io>

* Add `docstrings`, fix code, and re-run quality checks (#104)

* Remove unused `docs/rlxf.png` image

* Add `LICENSE` headers to `tests`

* Add missing `LICENSE` header in `principles.py`

* Clean and fix `tasks/preference/*.py`

- Add missing LICENSE headers
- Re-run `ruff` and `black`
- Fix code in `metadata_properties` (due to `arg_name` being unbound)
- `ratings` to be `List[float]` as `int` cast is being done within Argilla only

* Add `# type: ignore` in `vllm.py`

* Rename `vllm->vllm_` and add disclaimer on naming

* Add disclaimer on `openai_` file name

* Re-run `ruff --fix` and `ruff format` over `src`

* Add docstrings for `LLM` and subclasses

* Remove `PreferenceDataset` and fix side-effects

* Add `combine_dicts` and `CustomDataset` docstrings

* Add `Pipeline` docstrings

Also removed the unused `task=critique` in the `pipeline` arg, and renamed the `input` variable to `input_` to avoid shadowing Python's built-in `input`

* Remove `kwargs` from `CustomDataset` methods

* Add docstrings to `Task` and subclasses

* Add `__repr__` and `__repr_rich__` magic methods (#105)

* Add `__rich_repr__` method

* Add basic `__repr__`

* Add `verbose` parameter (#106)

* Add imports in `tasks` and `llm` subpackages (#107)

* Add imports in `tasks` and `llm` subpackages

* Fix `argilla` import errors when not installed

* Fix import errors when `hf-transformers` extra not installed

* Move to subpackage `utils`

* Add `tenacity` as non-optional dep

* Add `_ARGILLA_AVAILABLE`

* Remove `LLM` import

* Update `examples/*.py` and re-run before release (#108)

* Add `inference-endpoints-llm.py`

Add example for question-answering using `InferenceEndpointsLLM` and a custom `Task`

* Add `examples/pipeline-fn-ultrafeedback.py`

* Add `examples/pipeline-fn-ultrafeedback-labeller.py`

* Rename to `inference-endpoints-llm-custom-task.py`

* Update `examples/*.py`

* Update `examples/pipeline-vllm-and-openai.py`

* Add examples using `llama_cpp` and `transformers`

* Fix `openai` import and add `UltraJudge` (#111)

* Add missing `dataclass` to `SelfInstructTask`, include within `__init__`, and add an example (#109)

* Rename self_instruct_.py to self_instruct.py

* Add SelfInstructTask to init

* Fix dataclass issues selfinstruct

* Add simple example instruction gen

* Improve example

* Update examples/pipeline-selfinstruct-math-openai.py

Co-authored-by: Alvaro Bartolome <alvaro@argilla.io>

* Remove warning about naming

---------

Co-authored-by: Alvaro Bartolome <alvaro@argilla.io>

* Add `_check_package_is_installed` and fix previous issues (#112)

* Update description to include `AIF` acronym

* Align dependencies formatting and include `llama_cpp`

* Add missing LICENSE header

We should include the LICENSE header injection within the pre-commit

* Add `_check_package_is_available` with version-check and overall improvements

* Add missing `version` parsing in `_check_package_is_available`

* Add docstrings to `_check_package_is_installed`

* Use `importlib` over `pkg_resources` and remove `__name__ == "__main__"`

Co-authored-by: Gabriel Martin <gabriel@argilla.io>

---------

Co-authored-by: Gabriel Martin <gabriel@argilla.io>

* Fix `min_version` of `huggingface_hub` in `_check_package_is_available`

* Remove `warnings` from `_check_package_is_available` to reduce verbosity

* Update type hint to `ClassVar` (#113)

* Add basic `docs` with `mkdocs` and `mkdocs-material` (#80)

* Initial docs

* Add docs dependencies

* Update docs

* Move concepts snippets

* Add script reference

* Add quick example

* Add `python` extra to `mkdocstrings`

* Remove `mike` dep (at least for now)

* Update snippets and images

* Update guides page

* Remove about page

* Add favicon

---------

Co-authored-by: Gabriel Martín Blázquez <gmartinbdev@gmail.com>

* Fix `None` issues for `to_argilla` method (#114)

* Rename self_instruct_.py to self_instruct.py

* Add SelfInstructTask to init

* Fix dataclass issues selfinstruct

* Add simple example instruction gen

* Improve example

* Fix issue with none rationale

* Fix issues with Nones for to_argilla

* Update path of images

* Bump version to `0.1.0rc2`

* Add readme (#115)

* Include basic readme

---------

Co-authored-by: Gabriel Martin <gabriel@argilla.io>
Co-authored-by: Gabriel Martín Blázquez <gmartinbdev@gmail.com>
Co-authored-by: Daniel Vila <daniel@argilla.io>