train_dsin error #8

Closed
blldd opened this issue Sep 8, 2019 · 14 comments

Comments

@blldd

blldd commented Sep 8, 2019

Hi, I got an error while running train_dsin.py; the info is as follows:

Caused by op 'sparse_emb_14-brand/Gather_6', defined at:
File "train_dsin.py", line 52, in
att_embedding_size=1, bias_encoding=False)
File "/home/dedong/pycharmProjects/Emb4RS/models/DSIN/code/_models/dsin.py", line 85, in DSIN
sess_feature_list, sess_max_count, bias_encoding=bias_encoding)
File "/home/dedong/pycharmProjects/Emb4RS/models/DSIN/code/_models/dsin.py", line 154, in sess_interest_division
sparse_fg_list, sess_feture_list, sess_feture_list)
File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/deepctr/input_embedding.py", line 145, in get_embedding_vec_list
embedding_vec_list.append(embedding_dict[feat_name](lookup_idx))
File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/tensorflow/python/keras/_impl/keras/engine/topology.py", line 252, in __call__
output = super(Layer, self).__call__(inputs, **kwargs)
File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 575, in __call__
outputs = self.call(inputs, *args, **kwargs)
File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/tensorflow/python/keras/_impl/keras/layers/embeddings.py", line 158, in call
out = K.gather(self.embeddings, inputs)
File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/tensorflow/python/keras/_impl/keras/backend.py", line 1351, in gather
return array_ops.gather(reference, indices)
File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 2486, in gather
params, indices, validate_indices=validate_indices, name=name)
File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 1834, in gather
validate_indices=validate_indices, name=name)
File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
op_def=op_def)
File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1470, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): indices[0,0] = 136739 is not in [0, 79963)
[[Node: sparse_emb_14-brand/Gather_6 = Gather[Tindices=DT_INT32, Tparams=DT_FLOAT, validate_indices=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](sparse_emb_14-brand/embeddings/read, sparse_emb_14-brand/Cast_6)]]

Do you know how to fix this? Thanks!

@shenweichen
Owner

Please update your code to the latest version and run it in the environment described at
https://github.com/shenweichen/DSIN#operating-environment

@blldd
Author

blldd commented Sep 8, 2019

I run this code on TensorFlow CPU 1.4.0, because my CUDA is 10.0 and cannot run it on the GPU.
Do you know what this error means?

@shenweichen
Owner

Have you run your code on Python 3.6?

@blldd
Author

blldd commented Sep 8, 2019

Right.

@shenweichen
Owner

Check that your code is up to date with the latest commit.

@blldd
Author

blldd commented Sep 8, 2019

It is the latest commit, with deepctr==0.4.1.

@shenweichen
Owner

shenweichen commented Sep 8, 2019

Yes, I suggest you clone the whole repo and re-run it.

@blldd
Author

blldd commented Sep 8, 2019 via email

@blldd
Author

blldd commented Oct 9, 2019

Hi, my friend, thank you for your work. I tried to debug and found a tiny bug:

In 0_gen_sampled_data.py:

unique_cate_id = np.concatenate(
    (ad['cate_id'].unique(), log['cate'].unique()))

lbe.fit(unique_cate_id)

In 2_gen_dsin_input.py:

data = pd.merge(sample_sub, user, how='left', on='userid', )
data = pd.merge(data, ad, how='left', on='adgroup_id')

Here the merge loses some data (cate_id and brand).

sparse_feature_list = [SingleFeat(feat, data[feat].nunique() + 1) for feat in sparse_features + ['cate_id', 'brand']]

So here data['brand'].nunique() is smaller than the maximum input index.

I counted all unique brand values in the input, updated the fd accordingly, and then the code ran without errors.
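
To make the mismatch concrete, here is a minimal standalone sketch (toy data and made-up ids, not the repo's actual tables) of why nunique() + 1 undercounts the embedding vocabulary when the LabelEncoder was fit on the union of ad and log ids:

import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy stand-ins for the real tables: the encoder is fit on the union of
# ad and log categories, as in 0_gen_sampled_data.py.
ad = pd.DataFrame({'adgroup_id': [1, 2], 'cate_id': [10, 20]})
log = pd.DataFrame({'cate': [10, 20, 30, 40, 50]})

lbe = LabelEncoder()
lbe.fit(np.concatenate((ad['cate_id'].unique(), log['cate'].unique())))
ad['cate_id'] = lbe.transform(ad['cate_id'])  # encoded ids lie in [0, 5)
log['cate'] = lbe.transform(log['cate'])      # session ids can reach 4

# `data` stands in for the merged frame, which only sees ad's categories.
data = ad
print(data['cate_id'].nunique() + 1)  # 3 -> vocabulary size the model builds
print(int(log['cate'].max()) + 1)     # 5 -> what the session inputs can index

An embedding table sized from the merged frame is too small for the ids that the session (log) features look up, which is the same shape of failure as the "indices[0,0] = 136739 is not in [0, 79963)" error above.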

@MrDadiao


I have also encountered this problem. Could you please explain in detail how to fix this bug?

@jellchou


Hi, I met the same problem. Could you tell us how to fix the bug?

@blldd
Author

blldd commented Oct 11, 2019


Sorry for the late reply. I am not sure whether it is OK or not, but here is what I did:

  1. Record the dimensions in 0_gen_sampled_data.py:

    pd.to_pickle({
        'cate_id': SingleFeat('cate_id', len(np.unique(unique_cate_id)) + 1),
        'brand': SingleFeat('brand', len(np.unique(unique_brand)) + 1),
    }, '../model_input/dsin_fd_cate_brand_' + str(FRAC) + '.pkl')

  2. Update the input fd in train_dsin.py:

    cate_brand_fd = pd.read_pickle(
        '../model_input/dsin_fd_cate_brand_' + str(FRAC) + '.pkl')
    fd['sparse'][13] = cate_brand_fd['cate_id']
    fd['sparse'][14] = cate_brand_fd['brand']

  3. Rerun the script.

@jellchou


Thank you so much; let me try it.

@shenweichen shenweichen pinned this issue Oct 29, 2019
@shenweichen
Owner

Sorry for this mistake; we are planning to refactor our code in the future.
I think this error can be fixed by using

sparse_feature_list = [SingleFeat(feat, data[feat].max() + 1) for feat in sparse_features + ['cate_id', 'brand']]

instead of

sparse_feature_list = [SingleFeat(feat, data[feat].nunique() + 1) for feat in sparse_features + ['cate_id', 'brand']]
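
As a quick illustration (a toy sketch with made-up ids, not the repo's data) of why max() + 1 is a safer vocabulary size than nunique() + 1 for a label-encoded column that ends up with gaps after the merge:

import pandas as pd

# Hypothetical label-encoded brand column after the left merge: only a few
# of the encoded ids survive, and they are not contiguous.
data = pd.DataFrame({'brand': [0, 7, 42, 42]})

print(data['brand'].nunique() + 1)   # 4  -> id 42 would fall outside [0, 4)
print(int(data['brand'].max()) + 1)  # 43 -> covers every id present in `data`

This sizes the table against the ids actually present in the merged frame; the principle is the same as the error message above, namely that every looked-up index must fall inside [0, vocabulary_size).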

@shenweichen shenweichen unpinned this issue Oct 20, 2020