Where can I get "imdb-train.pkl"? #3

Open
k-terada opened this issue Oct 10, 2020 · 1 comment

Comments

@k-terada

I'm looking at pytorch-imdb-bert.py.
Where can I get "imdb-train.pkl"?

@jmakoske
Member

The files imdb-train.pkl and imdb-test.pkl are just slightly processed versions of the original data from http://ai.stanford.edu/~amaas/data/sentiment/. You can get the sentences and polarity values from the original data.

import pandas as pd

train_df = pd.read_pickle("/media/data2/imdb/imdb-train.pkl")
print(train_df.sample(10))

                                                sentence sentiment  polarity
15135  Just a dumb old movie. First Stanwyck's son ge...         2         0
22916  A meteorite falls in the country of a small to...         7         1
20820  Whether it's a good movie or not, films of thi...         7         1
17389  The '60s is an occasionally entertaining film,...         2         0
20392  As I work at a video store, I found it to be m...         1         0
17671  Everyone in the cast, from Sugiyama to Aoki an...        10         1
16207  The only connection this movie has to horror i...         1         0
19790  The show had great episodes, this is not one o...         4         0
5569   I thought this film was just about perfect. Th...         9         1
21911  I sat through this film and i have to say it o...         1         0
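
If you want to rebuild a similar file yourself, here is a minimal sketch. It is not the script that produced imdb-train.pkl. It assumes you have downloaded and unpacked the archive from the page above into a folder named aclImdb, where reviews are stored as train/pos/<id>_<rating>.txt and train/neg/<id>_<rating>.txt. The function name load_imdb_split is made up for illustration, and mapping the rating to the sentiment column is a guess based on the sample output above. Whatever "slight processing" was applied to the text in the original pickle is not reproduced here, and the row order will likely differ.

```python
from pathlib import Path

import pandas as pd


def load_imdb_split(root, split="train"):
    """Collect reviews from <root>/<split>/{neg,pos} into a DataFrame.

    Hypothetical helper: the column names mirror the sample output above,
    but the original preprocessing is unknown and is not replicated.
    """
    rows = []
    for label, polarity in (("neg", 0), ("pos", 1)):
        folder = Path(root) / split / label
        for path in sorted(folder.glob("*.txt")):
            # File names look like "<id>_<rating>.txt"; the rating is 1-10.
            rating = int(path.stem.split("_")[1])
            text = path.read_text(encoding="utf-8")
            rows.append(
                {"sentence": text, "sentiment": rating, "polarity": polarity}
            )
    return pd.DataFrame(rows, columns=["sentence", "sentiment", "polarity"])


# Example usage (assumes the archive was unpacked to ./aclImdb):
# train_df = load_imdb_split("aclImdb", "train")
# train_df.to_pickle("imdb-train.pkl")
```

Calling the same function with split="test" should give the counterpart of imdb-test.pkl, under the same assumptions.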
