Format of Pre-processed data #2

Closed
Gudakesh opened this issue Jun 1, 2018 · 1 comment

Comments


Gudakesh commented Jun 1, 2018

Hello.
What is the format of the pre-processed IFTTT dataset?
After unpickling the msr_data.pkl file, I can't make much sense of the data in it. Attributes such as correct_action_param, correct_trigger_param, label_names, words, and url occur together again and again as a bundle. After them comes another bundle of attributes, namely label_types, trigger_chans, action_chans, action_funcs, etc., and then a bundle of action and trigger fields. That is followed by some fields I don't understand; I think they are the fields that require user input and are initialized as NULL. Beyond that point I'm lost.

Can you tell me how to make sense of the data? It would also be helpful to know how to view the data in a pretty-printed form.

Owner

Jungyhuk commented Jun 2, 2018

We did not include the raw data we crawled from IFTTT (i.e., the pretty-formatted dataset) due to privacy concerns. However, we included the code for preprocessing the raw data in dataPreprocessor/process_IFTTT_data.py, where you can see the format of our raw data and how it is converted into the pkl file.
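
For looking at the pickle itself, the following is a minimal sketch rather than anything taken from the repository. It assumes msr_data.pkl loads with Python's standard pickle module and contains nested dicts and lists; the helpers `summarize` and `inspect_pickle` are hypothetical names introduced here. The `encoding="latin1"` argument is also an assumption, on the guess that the file may have been written under Python 2.

```python
import pickle
from pprint import pprint


def summarize(obj, depth=0, max_depth=3, max_items=3):
    """Describe the shape of a nested object without dumping all of it.

    Hypothetical helper: dicts keep their first few keys, lists and tuples
    report their length plus their first few items, scalars are shown as-is.
    """
    if depth >= max_depth:
        return type(obj).__name__
    if isinstance(obj, dict):
        items = list(obj.items())[:max_items]
        return {k: summarize(v, depth + 1, max_depth, max_items) for k, v in items}
    if isinstance(obj, (list, tuple)):
        head = [summarize(v, depth + 1, max_depth, max_items) for v in obj[:max_items]]
        return {"type": type(obj).__name__, "len": len(obj), "first_items": head}
    if isinstance(obj, str):
        return obj[:40]  # truncate long strings to keep the output readable
    if isinstance(obj, (int, float, bool)) or obj is None:
        return obj
    return type(obj).__name__


def inspect_pickle(path, encoding="latin1"):
    """Load a pickle file and pretty-print a summary of its structure."""
    with open(path, "rb") as f:
        # encoding="latin1" is an assumption for pickles written by Python 2
        data = pickle.load(f, encoding=encoding)
    summary = summarize(data)
    pprint(summary)
    return summary
```

Calling `inspect_pickle("msr_data.pkl")` from the directory that holds the file should print the top-level layout. Raising `max_depth` or `max_items` in `summarize` shows more of each bundle. As with any pickle, only load files from a source you trust.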

Jungyhuk closed this as completed Jun 2, 2018