Make the creation of LoDTensor more user friendly in book examples #10735

Closed
kexinzhao opened this issue May 17, 2018 · 0 comments · Fixed by #10817

kexinzhao commented May 17, 2018

Currently, the book examples that use LoDTensor with LoD info for training or inference (understand sentiments, label semantic roles, word2vec, recommender system, rnn_encoder_decoder, and machine translation) each define similar helper functions to create a LoDTensor from a Python list or numpy array.

Some examples are as follows:

def create_random_lodtensor(lod, place, low, high):
    data = np.random.random_integers(low, high, [lod[-1], 1]).astype("int64")
    res = fluid.LoDTensor()
    res.set(data, place)
    res.set_lod([lod])
    return res

def to_lodtensor(data, place):
    seq_lens = [len(seq) for seq in data]
    cur_len = 0
    lod = [cur_len]
    for l in seq_lens:
        cur_len += l
        lod.append(cur_len)
    flattened_data = np.concatenate(data, axis=0).astype("int64")
    flattened_data = flattened_data.reshape([len(flattened_data), 1])
    res = fluid.LoDTensor()
    res.set(flattened_data, place)
    res.set_lod([lod])
    return res

lod = [0, 4, 10]
word = create_random_lodtensor(
    lod, place, low=0, high=word_dict_len - 1)

The LoD information used in the book examples is offset-based, e.g. [0, 4, 10]. This is a little confusing to users, who are more comfortable specifying sequence lengths, e.g. [[4, 6]].
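
To illustrate how the offset form is read, here is a minimal pure-Python sketch. The `flat` list is a hypothetical stand-in for the 10 rows of data a LoDTensor would hold; no Paddle API is used.

```python
# Offset-based LoD: sequence i occupies rows lod[i] up to (not including) lod[i + 1].
lod = [0, 4, 10]
flat = list(range(10))  # hypothetical stand-in for 10 rows of word ids

# Pair each offset with the next one to slice out the individual sequences.
sequences = [flat[start:end] for start, end in zip(lod, lod[1:])]
lengths = [len(seq) for seq in sequences]

print(sequences)  # [[0, 1, 2, 3], [4, 5, 6, 7, 8, 9]]
print(lengths)    # [4, 6]
```

The lengths [4, 6] recovered here are the values a user would rather pass directly.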

Although we don't want to change the offset-based implementation of LoD, we want to provide a user-friendly wrapper like the following:

lod_tensor = create_lod_tensor(data, [[4, 6]], place)  # data: numpy array or tensor

Internally, the length-based LoD input will be converted to the offset-based form.
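
The conversion could look roughly like the sketch below. The name `lengths_to_offsets` is hypothetical and not an existing Paddle function; the sketch only assumes that each LoD level is converted independently by taking a running sum of its lengths.

```python
from itertools import accumulate


def lengths_to_offsets(length_lod):
    """Convert a length-based LoD (a list of levels) to offset-based form.

    Hypothetical helper for illustration only; not part of the fluid API.
    """
    offset_lod = []
    for level in length_lod:
        if any(n < 0 for n in level):
            raise ValueError("sequence lengths must be non-negative")
        # Offsets start at 0 and grow by each sequence length in turn.
        offset_lod.append([0] + list(accumulate(level)))
    return offset_lod


print(lengths_to_offsets([[4, 6]]))  # [[0, 4, 10]]
```

With this in place, the wrapper would only need to call the existing `set` and `set_lod` methods with the converted offsets, as the helper functions above already do.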

To-do list:

  1. Create such a utility function.
  2. Clean up the book example code that duplicates the LoDTensor creation functions.