Unable to find files #1
Also, you have mentioned "from features import lexical, syntactic, writing_density, |
This source code is part of a larger project structure. The provided code mainly illustrates the deep learning architectures; this codebase is not a running system. For the project, the raw data files from the organizers were preprocessed first, and then the experiments were run, so there were several preprocessed files. You have the code to generate them, not the actual files. |
Thank you, Sudipta, for clarifying most of my doubts. I am still a little confused about how you got the following files. For the first one, did you get the csv file after merging the train, trial, and test json files? |
Exactly. All the data were merged into a single csv for easy manipulation. |
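For readers following along, merging the split json files into a single csv can be sketched with the standard library alone. This is a minimal illustration, not the repo's actual preprocessing code; the file names and the `split` column are assumptions.

```python
import csv
import json

def merge_json_to_csv(json_paths, out_csv):
    """Merge several json files (each a list of records) into one csv.

    json_paths: list of (path, split_name) pairs, e.g. [("train.json", "train")].
    A "split" column is added so the origin of each row is preserved.
    """
    rows = []
    for path, split in json_paths:
        with open(path, encoding="utf-8") as f:
            for rec in json.load(f):
                rec = dict(rec)
                rec["split"] = split  # remember which split each row came from
                rows.append(rec)
    # Union of all keys, sorted for a stable header order.
    fieldnames = sorted({k for r in rows for k in r})
    with open(out_csv, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return rows

# Hypothetical usage (substitute the organizers' actual file names):
# merge_json_to_csv([("train.json", "train"), ("trial.json", "trial"),
#                    ("test.json", "test")], "merged.csv")
```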
Hi Sudipta, |
Hi, if you go through the paper you will get an idea of the process. We used SenticNet. |
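As an aside for readers, SenticNet-style sentiment features could be sketched roughly as below. This is a minimal stand-in, not the project's code: the polarity table is a toy placeholder for the real SenticNet lexicon, and the feature names are assumptions.

```python
# Toy polarity table standing in for the SenticNet lexicon (values in [-1, 1]).
TOY_POLARITY = {
    "love": 0.9, "joy": 0.8, "funny": 0.6,
    "fear": -0.7, "sad": -0.8,
}

def sentiment_features(tokens, polarity=TOY_POLARITY):
    """Aggregate per-token polarity scores into simple document-level features."""
    scores = [polarity[t] for t in tokens if t in polarity]
    if not scores:
        return {"mean": 0.0, "max": 0.0, "min": 0.0, "n_pos": 0, "n_neg": 0}
    return {
        "mean": sum(scores) / len(scores),
        "max": max(scores),
        "min": min(scores),
        "n_pos": sum(s > 0 for s in scores),  # count of positive concepts
        "n_neg": sum(s < 0 for s in scores),  # count of negative concepts
    }
```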
Hi Sudipta, |
a) We used all of them in two versions, one for the microblogs and another for the headlines. So one piece of code is used to generate processed data for both datasets. Hope it helps. All the best. |
Hi Sudipta,
Thank You for clarifying my doubts. You have been really very helpful.
There is still one doubt that I was trying to resolve on my own, but I
couldn't. When I run the model and invoke the function
pack_data_to_format(), I get the following error, and I am unable to
find a fix.
[image: Inline image 2]
…On Fri, Nov 3, 2017 at 7:36 PM, Sudipta Kar ***@***.***> wrote:
a) We used all of them in two versions, one for the microblogs and
another for the headlines. So one piece of code is used to generate
processed data for both datasets.
b) We ultimately didn't use it, as we created the concept vectors during
the preprocessing step. But if you want to use it, you can use a simple
tf-idf vectorizer, since it will be modeled as a bag of concepts.
Hope it helps. All the best.
|
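The bag-of-concepts tf-idf suggestion above can be sketched without any external libraries. This is a minimal hand-rolled vectorizer using a smoothed idf (in the style of scikit-learn's formula), not the code from this repo; concept lists below are illustrative.

```python
import math
from collections import Counter

def tfidf_bag_of_concepts(docs):
    """Turn lists of concepts into tf-idf vectors over a shared vocabulary.

    docs: list of concept lists, e.g. [["love", "joy"], ["fear", "joy"]].
    Returns (vocab, vectors) where vectors[i][j] is the tf-idf weight of
    vocab[j] in docs[i].
    """
    n = len(docs)
    # Document frequency of each concept.
    df = Counter()
    for d in docs:
        df.update(set(d))
    vocab = sorted(df)
    # Smoothed idf: log((1 + n) / (1 + df)) + 1, so no weight is ever zero
    # for a concept that appears in the document.
    idf = {c: math.log((1 + n) / (1 + df[c])) + 1.0 for c in vocab}
    vectors = []
    for d in docs:
        tf = Counter(d)
        vectors.append([tf[c] * idf[c] for c in vocab])
    return vocab, vectors
```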
Hi Sudipta, You mentioned that 'About the features, lexical, embeddings, and sentiment features are relevant.' But I can only see Thanks. |
The sentiment features were extracted by code in the preprocessing step. The code was written in a hurry, so it is not exactly well structured. |
Hi Sudipta,
I am unable to find DATA_FILES_LIST, ORIGINAL_DATA_DIR, or RAW_DATA_PATH in the config.py file.
Also, what is the mb_train_trial_test_new_prs.csv file for? The training and test data are in json format.