Dataset from dicts #127

Merged: 2 commits, Nov 7, 2019

tholor (Member) commented Oct 25, 2019

As requested in #85, this PR adds an option that lets the DataSilo load data from dicts instead of loading it automatically from files.

Example usage:

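# Imports assume the FARM package layout at the time of this PR.
from farm.data_handler.data_silo import DataSilo
from farm.data_handler.processor import TextClassificationProcessor

# `tokenizer` and `batch_size` are assumed to be defined earlier.
# All file-related arguments are None because no files are read.
processor = TextClassificationProcessor(tokenizer=tokenizer,
                                        max_seq_len=8,
                                        data_dir=None,
                                        train_filename=None,
                                        label_list=["OTHER", "OFFENSE"],
                                        metric="f1_macro",
                                        dev_filename=None,
                                        test_filename=None,
                                        dev_split=0.0,
                                        label_column_name=None)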

data_silo = DataSilo(
    processor=processor,
    batch_size=batch_size,
    automatic_loading=False)

basic_texts = [
    {"text": "Martin Müller spielt Handball in Berlin.", "text_classification_label": "OTHER"},
    {"text": "Schartau sagte dem Tagesspiegel, dass Fischer ein Idiot sei.", "text_classification_label": "OFFENSE"}
]
data_silo._load_data(train_dicts=basic_texts)
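
When the training data already lives in memory as (text, label) pairs, the dicts shown above can be built programmatically. The following is a minimal sketch and not part of this PR: `to_train_dicts` is a hypothetical helper, and the keys `text` and `text_classification_label` are simply copied from the example above. It also rejects labels that are missing from the label list, which seems a sensible check before handing the dicts to the DataSilo.

```python
from typing import Dict, Iterable, List, Tuple

# Label set taken from the usage example above.
LABEL_LIST = ["OTHER", "OFFENSE"]


def to_train_dicts(
    pairs: Iterable[Tuple[str, str]],
    label_list: List[str],
    label_key: str = "text_classification_label",
) -> List[Dict[str, str]]:
    """Hypothetical helper (not a FARM function): build dicts from (text, label) pairs."""
    allowed = set(label_list)
    dicts = []
    for text, label in pairs:
        # Fail early on labels the processor would not know about.
        if label not in allowed:
            raise ValueError(
                f"Unknown label {label!r}; expected one of {sorted(allowed)}"
            )
        dicts.append({"text": text, label_key: label})
    return dicts


pairs = [
    ("Martin Müller spielt Handball in Berlin.", "OTHER"),
    ("Schartau sagte dem Tagesspiegel, dass Fischer ein Idiot sei.", "OFFENSE"),
]
train_dicts = to_train_dicts(pairs, LABEL_LIST)
```

The resulting `train_dicts` has the same shape as `basic_texts` and could presumably be passed the same way, i.e. `data_silo._load_data(train_dicts=train_dicts)`.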

tholor requested a review from tanaysoni on Nov 7, 2019, 08:26
tholor added the "enhancement" and "part: processor" labels on Nov 7, 2019
tholor self-assigned this on Nov 7, 2019
tholor changed the title from "WIP Dataset from dicts" to "Dataset from dicts" on Nov 7, 2019
Labels: enhancement, part: processor
2 participants