This repository has been archived by the owner on Jan 24, 2024. It is now read-only.

07 label_semantic_roles #548

Merged

daming-lu
Contributor

The English and Chinese READMEs and train.py are all done.

Contributor

@jetfuel jetfuel left a comment

Some suggestions apply to the Chinese README, the English README, and train.py.
Otherwise, LGTM.


# Label sequence
target = paddle.layer.data(name='target', type=d_type(label_dict_len))
mark_dict_len = 2
Contributor

Can we keep the comments from the original versions?

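-mark_dict_len = 2    # dimension of the predicate context region mark; it is a binary (0/1) feature, so the dimension is 2
-word_dim = 32        # word embedding dimension
-mark_dim = 5         # the predicate context region mark is mapped to a real-valued vector through a lookup table; this is the dimension of that vector
-hidden_dim = 512     # dimension of the LSTM hidden-layer vector: 512 / 4
-depth = 8            # depth of the stacked LSTM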

Contributor Author

Done

def d_type(size):
    return paddle.data_type.integer_value_sequence(size)
embedding_name = 'emb'
default_std = 1 / math.sqrt(hidden_dim) / 3.0 # ?
Contributor

Please remove the trailing "# ?" comment.

Contributor Author

Done


predicate_embedding = paddle.layer.embedding(
    size=word_dim,
# def db_lstm():
Contributor

Why do we have two versions of db_lstm? Do we intend to show the v2 version as well?

Contributor Author

Removed the commented-out version.

target = paddle.layer.data(name='target', type=d_type(label_dict_len))
crf_cost = paddle.layer.crf(
    size=label_dict_len,
feature_out = db_lstm(**locals())
Contributor

What is **locals()? I don't see it declared anywhere else.

Contributor Author

@daming-lu daming-lu Jun 18, 2018

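def just_print(haha, xixi, **whatever):
    # **whatever collects any keyword arguments other than haha and xixi
    print('haha is :', haha)
    print('xixi is :', xixi)

def demo_it():
    haha = 'I am just laughing'
    xixi = 'I am just grinning'
    # locals() is a dict of this function's local variables (haha and xixi);
    # ** unpacks that dict into keyword arguments
    just_print(**locals())

demo_it()

The output (under Python 3) is: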

$ python show_dict.py 
haha is : I am just laughing
xixi is : I am just grinning

Contributor Author

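So when we call it, we pass in locals(), which returns a dict of the local variables in the current scope, and ** unpacks that dict into keyword arguments. The name "ignored" is arbitrary: any identifier works for the parameter that collects the leftover keyword arguments.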
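
To make the role of the catch-all parameter concrete, here is a minimal Python 3 sketch. The names build, build_strict, and caller are hypothetical and do not appear in this PR; the sketch relies only on standard Python behavior: locals() returns a dict of the variables in the current scope, and ** unpacks a dict into keyword arguments. It shows that a function without a ** catch-all rejects the extra local names, which is why db_lstm needs one when it is called with **locals().

```python
def build(word, mark, **ignored):
    # Hypothetical stand-in for db_lstm: extra keyword arguments are
    # collected in `ignored` and never used.
    return (word, mark, sorted(ignored))


def build_strict(word, mark):
    # Same signature without the catch-all: any unexpected keyword
    # argument raises TypeError.
    return (word, mark)


def caller():
    word = "w"
    mark = "m"
    depth = 8  # a local that neither function declares as a parameter
    # Copy of the locals defined so far: word, mark, depth.
    snapshot = dict(locals())
    accepted = build(**snapshot)
    try:
        build_strict(**snapshot)
        strict_error = None
    except TypeError as exc:
        strict_error = type(exc).__name__
    return accepted, strict_error


result = caller()
print(result)
```

Passing the arguments explicitly, for example build(word=word, mark=mark), would avoid the need for a catch-all parameter at the cost of a longer call.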


# Label sequence
target = paddle.layer.data(name='target', type=d_type(label_dict_len))
mark_dict_len = 2
Contributor

Can we keep the original comments?

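-mark_dict_len = 2    # dimension of the predicate context region mark; it is a binary (0/1) feature, so the dimension is 2
-word_dim = 32        # word embedding dimension
-mark_dim = 5         # the predicate context region mark is mapped to a real-valued vector through a lookup table; this is the dimension of that vector
-hidden_dim = 512     # dimension of the LSTM hidden-layer vector: 512 / 4
-depth = 8            # depth of the stacked LSTM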

Contributor Author

Done

for i in range(1, depth):
    mix_hidden = paddle.layer.mixed(
def db_lstm(word, predicate, ctx_n2, ctx_n1, ctx_0, ctx_p1, ctx_p2, mark,
            **ignored):
Contributor

What does **ignored do? It is not used anywhere.

Contributor Author

Done

# Set the threshold low to speed up the CI test
if float(cost) < 60.0:
    if save_dirname is not None:
        # TODO(liuyiqun): Change the target to crf_decode
Contributor

Can we remove the TODO in the README? We can keep it in train.py.

@daming-lu daming-lu merged commit efac2fa into PaddlePaddle:high-level-api-branch Jun 18, 2018