bAbI refactored #97

Open
wants to merge 5 commits into develop
59 changes: 59 additions & 0 deletions configs/text2text/babi.yaml
@@ -0,0 +1,59 @@
training:
  problem:
    name: &name BABI
    batch_size: &b 1
    data_type: train
    embedding_type: glove.6B.100d
    embedding_size: 50
Contributor:
glove.6B.100d already uses an embedding dimension of size 100, so I do not understand embedding_size: 50.

    use_mask: false
    joint_all: true
    one_hot_embedding: true
Contributor:
How is this related to embedding_type: glove.6B.100d and embedding_size: 50?

Contributor (author):
I just checked and you are right: glove.6B.100d embeddings have size 100. When one-hot embeddings are used, the size is the size of the dictionary, so I will store the dictionary size instead.
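For reference, a minimal sketch (illustrative only, not code from this PR) of why the one-hot case ties the embedding size to the dictionary size, while glove.6B.100d vectors are fixed at 100 dimensions:

```python
import torch

# Toy dictionary; in the real problem this would be built from the bAbI vocabulary.
vocab = ["<pad>", "john", "went", "to", "the", "kitchen", "."]
vocab_size = len(vocab)

# A one-hot "embedding" is just an identity lookup table, so its dimension is
# necessarily vocab_size, not a free hyperparameter.
one_hot_table = torch.eye(vocab_size)
assert one_hot_table[vocab.index("kitchen")].shape[0] == vocab_size

# Pretrained GloVe vectors, by contrast, have a fixed dimension
# (100 for glove.6B.100d), so embedding_size: 50 cannot apply to them.
```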

    tasks: [1, 2, 3]
    ten_thousand_examples: true
    truncation_length: 50
    directory: ./
Contributor:
As @tkornut indicated, we should change that to data_folder: "~/data/babi/"

Contributor (author):
ok
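As a side note, a minimal sketch (assuming standard path handling, not the PR's code) of how a default like "~/data/babi/" would typically be expanded at load time:

```python
import os

# Expand the user's home directory in the configured data folder path.
data_folder = os.path.expanduser("~/data/babi/")
print(data_folder)  # e.g. /home/<user>/data/babi/
```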



  gradient_clipping: 20

  seed_numpy: 847055145
  seed_torch: 697881609

  optimizer:
    name: RMSprop
    lr: 0.0001

  # settings parameters
  terminal_conditions:
    loss_stop: 1.0e-2
    epoch_limit: 100

validation:
  problem:
    name: *name
    batch_size: *b
    data_type: valid
    embedding_type: glove.6B.100d
    joint_all: true
    one_hot_embedding: true
Contributor:
Same remark here as above

Contributor (author):
I just checked and you are right: glove.6B.100d embeddings have size 100. When one-hot embeddings are used, the size is the size of the dictionary, so I will store the dictionary size instead.

    tasks: [1, 2, 3]
    ten_thousand_examples: true
    truncation_length: 50
Contributor:
Same question as @tkornut: is this related to the maximum size of a question, or is it related to padding?
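For context, a minimal sketch (an assumption about typical preprocessing, not the PR's actual implementation) of how a truncation_length of 50 usually combines truncation and padding:

```python
# Sequences longer than truncation_length are cut; shorter ones are padded,
# so every item in a batch ends up with exactly truncation_length tokens.
def truncate_and_pad(token_ids, truncation_length=50, pad_id=0):
    token_ids = token_ids[:truncation_length]
    return token_ids + [pad_id] * (truncation_length - len(token_ids))

assert len(truncate_and_pad(list(range(120)))) == 50  # truncated
assert len(truncate_and_pad([7, 8, 9])) == 50          # padded
```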


testing:
  problem:
    name: *name
    batch_size: *b
    data_type: test
    embedding_type: glove.6B.100d
    joint_all: true
    one_hot_embedding: true
    tasks: [1, 2, 3]
    ten_thousand_examples: true
    truncation_length: 50

model:
  name: LSTM
  # Hidden state.
  hidden_state_size: 256
  num_layers: 1
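As a rough illustration (an assumed mapping, not necessarily how the miprometheus LSTM model wires these values), the model section corresponds to a recurrent layer along these lines:

```python
import torch.nn as nn

# hidden_state_size and num_layers come from the config; the input size would be
# the embedding dimension (100 for glove.6B.100d, or the dictionary size for
# one-hot embeddings).
lstm = nn.LSTM(input_size=100, hidden_size=256, num_layers=1, batch_first=True)
```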
4 changes: 4 additions & 0 deletions miprometheus/problems/question_context_to_class/__init__.py
@@ -0,0 +1,4 @@
from .babiqa_dataset_single_question import BABI


__all__ = ['BABI']