Sequence feature in demo "DeepFM_with_sequence_feature.py". #3

Closed
cjfcsjt opened this issue Dec 13, 2021 · 1 comment

cjfcsjt commented Dec 13, 2021

Should the field "sequence" share embeddings with the field "adgroup_id"? I found that "encoder.fit()" assigns each field its own encoder, such as a tokenizer. Since the given tiny dataset records the user's historical behavior (an ad sequence), my understanding is that an id appearing in the field "sequence" may also appear in the field "adgroup_id". It therefore seems that the field "sequence" should share the same encoder (i.e., tokenizer) with the field "adgroup_id", but the demo "DeepFM_with_sequence_feature.py" gives these two fields separate encoders.
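To make the concern concrete, here is a minimal sketch in plain Python of what can happen when two fields are tokenized independently. It is not FuxiCTR code: `build_vocab` and the toy ids are hypothetical, and the "first seen, next index" assignment is only an assumption about how a per-field tokenizer might behave.

```python
# Hypothetical toy example; build_vocab is NOT part of FuxiCTR.
def build_vocab(tokens):
    """Map each distinct token to an index, reserving 0 for padding."""
    vocab = {"__PAD__": 0}
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab


# Made-up ad ids: the same ids occur in both fields.
adgroup_ids = ["a7", "a3", "a9"]
click_sequences = ["a3^a9", "a9^a7^a3"]
seq_tokens = [tok for row in click_sequences for tok in row.split("^")]

# Separate encoders: each field builds its own vocabulary.
adgroup_vocab = build_vocab(adgroup_ids)
sequence_vocab = build_vocab(seq_tokens)

# Ids whose index differs between the two fields.
separate_mismatch = {
    tok for tok in adgroup_ids if adgroup_vocab[tok] != sequence_vocab[tok]
}

# Shared encoder: one vocabulary fitted on both fields.
shared_vocab = build_vocab(adgroup_ids + seq_tokens)

print(adgroup_vocab["a3"], sequence_vocab["a3"], shared_vocab["a3"])
```

With separate vocabularies the same ad id can land on different indices, so the two fields cannot meaningfully look up the same embedding row. With one shared vocabulary the index is consistent across both fields.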

zhujiem (Contributor) commented Dec 13, 2021

Thanks for your suggestion. I have revised the config to add share_embedding: "adgroup_id" to the sequence feature:

taobao_tiny_sequence:
    data_root: ../data/
    data_format: csv
    train_data: ../data/tiny_data/train_sample.csv
    valid_data: ../data/tiny_data/valid_sample.csv
    test_data: ../data/tiny_data/test_sample.csv
    min_categr_count: 1
    feature_cols:
        [{name: ["userid","adgroup_id","pid","cate_id","campaign_id","customer","brand","cms_segid",
                 "cms_group_id","final_gender_code","age_level","pvalue_level","shopping_level","occupation"],
          active: True, dtype: str, type: categorical},
         {name: "click_sequence", active: True, dtype: str, type: sequence, splitter: "^", max_len: 5,
          encoder: "MaskedAveragePooling", share_embedding: "adgroup_id"}]
    label_col: {name: clk, dtype: float}
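As a rough illustration of what sharing an embedding between "adgroup_id" and "click_sequence" could look like, here is a plain-Python sketch. It is not the FuxiCTR implementation: `encode_sequence`, `masked_average_pooling`, the toy vocabulary, and the embedding values are all hypothetical, and the choices of index 0 for padding, right-padding, and keeping the last `max_len` items are assumptions. Only the splitter "^" and max_len 5 come from the config above.

```python
# Hypothetical sketch; these helpers are NOT FuxiCTR functions.
def encode_sequence(raw, vocab, splitter="^", max_len=5):
    """Split a raw sequence string, map tokens to indices, pad with 0."""
    # Assumption: unknown tokens fall back to the padding index 0.
    ids = [vocab.get(tok, 0) for tok in raw.split(splitter)]
    ids = ids[-max_len:]  # assumption: keep the most recent items
    return ids + [0] * (max_len - len(ids))


def masked_average_pooling(ids, table):
    """Average the embedding rows of all non-padding indices."""
    dim = len(table[0])
    rows = [table[i] for i in ids if i != 0]  # mask out padding
    if not rows:
        return [0.0] * dim
    return [sum(row[d] for row in rows) / len(rows) for d in range(dim)]


# One vocabulary and one embedding table, used by both fields.
vocab = {"__PAD__": 0, "a7": 1, "a3": 2, "a9": 3}
shared_table = [[0.0, 0.0], [1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# Categorical field: a direct row lookup.
adgroup_vec = shared_table[vocab["a3"]]

# Sequence field: same table, pooled over the clicked ads.
seq_ids = encode_sequence("a3^a9", vocab)
seq_vec = masked_average_pooling(seq_ids, shared_table)

print(adgroup_vec, seq_ids, seq_vec)
```

Because both fields index the same table, the vector learned for an ad as the candidate "adgroup_id" is the same one that contributes to the pooled click history.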

cjfcsjt closed this as completed Dec 14, 2021