Word2vec basic show id and word together #1877

Merged

Conversation

hunkim
Contributor

@hunkim hunkim commented Apr 12, 2016

When I first read the output, I spent a lot of time matching these numbers to their words. Now, printing each word alongside its id clearly shows what's going on in the sample data and prediction pairs.

[New output]: clearly shows the word ids and corresponding words
Sample data [5239, 3084, 12, 6, 195, 2, 3137, 46, 59, 156] ['anarchism', 'originated', 'as', 'a', 'term', 'of', 'abuse', 'first', 'used', 'against']
3084 originated -> 12 as
3084 originated -> 5239 anarchism
12 as -> 6 a
12 as -> 3084 originated
6 a -> 12 as
6 a -> 195 term
195 term -> 6 a
195 term -> 2 of

[Old output]: no words for the sample data. Word ids and words are printed on alternating lines, so it's very hard to read
Sample data [5239, 3084, 12, 6, 195, 2, 3137, 46, 59, 156]
3084 -> 5239
originated -> anarchism
3084 -> 12
originated -> as
12 -> 3084
as -> originated
12 -> 6
as -> a
6 -> 195
a -> term
6 -> 12
a -> as
195 -> 6
term -> a
195 -> 2
term -> of
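
For readers who want to see the formatting change in isolation, here is a minimal sketch of the idea. It is not the actual diff from this PR: the helper names `format_sample` and `format_pair` are hypothetical, and `reverse_dictionary`, `data`, and `pairs` are small hand-written stand-ins that reuse a few ids and words from the output above.

```python
# Hypothetical stand-in for the id-to-word mapping; values copied from the output above.
reverse_dictionary = {
    5239: 'anarchism', 3084: 'originated', 12: 'as',
    6: 'a', 195: 'term', 2: 'of',
}

# Hypothetical sample of encoded words and (input id, label id) pairs.
data = [5239, 3084, 12, 6, 195, 2]
pairs = [(3084, 12), (3084, 5239), (12, 6), (12, 3084)]


def format_sample(ids, id_to_word):
    """Return the id list followed by the matching word list on one line."""
    return 'Sample data {} {}'.format(ids, [id_to_word[i] for i in ids])


def format_pair(input_id, label_id, id_to_word):
    """Return 'id word -> id word' so each id sits next to its word."""
    return '{} {} -> {} {}'.format(
        input_id, id_to_word[input_id], label_id, id_to_word[label_id])


print(format_sample(data, reverse_dictionary))
for input_id, label_id in pairs:
    print(format_pair(input_id, label_id, reverse_dictionary))
```

Keeping the id and the word on the same line means a reader never has to pair up two separate lines by position, which is the readability problem described for the old output.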

@tensorflow-jenkins
Collaborator

Can one of the admins verify this patch?

@vrv

vrv commented Apr 15, 2016

@tensorflow-jenkins: test this please

@gunan
Contributor

gunan commented Apr 15, 2016

Can one of the admins verify this patch?

@hunkim
Contributor Author

hunkim commented Apr 15, 2016

@vrv Hmm. I am not sure where/how this PR breaks those tests. It was fine in my environment. I'll double check.

@vrv

vrv commented Apr 15, 2016

no, jenkins broke, we restarted it. let me try again: @tensorflow-jenkins test this please

@vrv

vrv commented Apr 15, 2016

Jenkins keeps breaking but python3/mac tests passed so this is syntactically fine. Merging.

@vrv vrv merged commit 4d9ebec into tensorflow:master Apr 15, 2016