Word2Vec 0.0.3.3 simple training with 12 tweets throwing NullPointerException at index.allDocs() #99

Closed
owlmsj opened this issue Dec 16, 2014 · 3 comments

owlmsj commented Dec 16, 2014

I am using Word2Vec from branch 3.3 and trying to train it on 12 docs (tweets), but fit() throws an exception as soon as it starts.

When I do the same with roughly 1,500 docs (still tweets), everything runs OK.

What's going wrong?

My code:

List<String> sentences = readSentencesFromTweetsFileAndPreProcess();
List<String> labels = new ArrayList<String>();

LabelAwareSentenceIterator docIter = new TutorialLabelAwareIterator(sentences,labels);

UimaTokenizerFactory factory = new UimaTokenizerFactory();

InMemoryLookupCache cache = new InMemoryLookupCache();

WeightLookupTable table = new InMemoryLookupTable.Builder().vectorLength(100).useAdaGrad(false).cache(cache).lr(0.025f).build();

Word2Vec vec = new Word2Vec.Builder()
                .minWordFrequency(0).iterations(5)
                .layerSize(100).lookupTable(table)
                .stopWords(new ArrayList<String>())
                .vocabCache(cache)
                .windowSize(5).iterate(docIter).tokenizerFactory(factory).build();

vec.fit();

ERROR:

o.d.t.i.LuceneInvertedIndex - Creating new index writer
o.d.b.v.BaseTextVectorizer - Invoking finish on index
o.d.m.w.Word2Vec - Building binary tree
o.d.m.w.Word2Vec - Constructing priority queue
o.d.m.w.Word2Vec - Built tree
o.d.m.w.Word2Vec - Resetting weights
o.d.m.w.Word2Vec - Training word2vec multithreaded
Exception in thread "main" java.lang.NullPointerException
    at org.deeplearning4j.text.invertedindex.LuceneInvertedIndex.allDocs(LuceneInvertedIndex.java:209)
    at org.deeplearning4j.models.word2vec.Word2Vec.fit(Word2Vec.java:460)
    at br.com.stilingue.deeplearning.Word2VecTutorial.tutorial33_2(Word2VecTutorial.java:66)
    at br.com.stilingue.deeplearning.Word2VecTutorial.main(Word2VecTutorial.java:24)
@owlmsj owlmsj changed the title Word2Vec 0.0.3.3 simple training with 12 tweets throwing exception Word2Vec 0.0.3.3 simple training with 12 tweets throwing exception allDocs() NullPointerException Dec 17, 2014
@owlmsj owlmsj changed the title Word2Vec 0.0.3.3 simple training with 12 tweets throwing exception allDocs() NullPointerException Word2Vec 0.0.3.3 simple training with 12 tweets throwing NullPointerException at index.allDocs() Dec 17, 2014

owlmsj commented Dec 18, 2014

After the latest pull (with a clean install from the project root), Word2Vec gives me this error:

Exception in thread "worker1" java.lang.IllegalStateException: Given ndarray does not have data type float
    at org.nd4j.linalg.factory.DataTypeValidation.assertFloat(DataTypeValidation.java:26)
    at org.nd4j.linalg.factory.DataTypeValidation.assertFloat(DataTypeValidation.java:17)
    at org.nd4j.linalg.jblas.BlasWrapper.axpy(BlasWrapper.java:186)
    at org.deeplearning4j.models.embeddings.inmemory.InMemoryLookupTable.iterateSample(InMemoryLookupTable.java:208)
    at org.deeplearning4j.models.word2vec.Word2Vec.iterate(Word2Vec.java:398)
    at org.deeplearning4j.models.word2vec.Word2Vec.skipGram(Word2Vec.java:382)
    at org.deeplearning4j.models.word2vec.Word2Vec.trainSentence(Word2Vec.java:359)
    at org.deeplearning4j.models.word2vec.Word2Vec$1.run(Word2Vec.java:166)
    at java.lang.Thread.run(Thread.java:745)

agibsonccc (Contributor) commented

This error comes into play when data types are initialized wrong: doubles must not be mixed with floats, etc. I fixed this here:
if (neu1e.data().dataType() == DataBuffer.FLOAT) {
    Nd4j.getBlasWrapper().axpy((float) g, syn1, neu1e);
    Nd4j.getBlasWrapper().axpy((float) g, l1, syn1);
} else {
    Nd4j.getBlasWrapper().axpy(g, syn1, neu1e);
    Nd4j.getBlasWrapper().axpy(g, l1, syn1);
}

Reopen this if you still get that.
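
For readers without ND4J at hand, the idea behind the fix can be sketched in plain Java: pick the single-precision or double-precision axpy (y += alpha * x) from the runtime type of the buffers, and narrow the scalar only on the float path. This is an illustration under stated assumptions, not the library's code: plain arrays stand in for ND4J buffers, and the class and method names (TypedAxpySketch, axpyFloat, axpyDouble, axpyDispatch) are hypothetical.

```java
import java.util.Arrays;

/** Sketch only: plain arrays stand in for ND4J buffers; every name here is hypothetical. */
final class TypedAxpySketch {

    /** y += alpha * x for single-precision data. */
    static void axpyFloat(float alpha, float[] x, float[] y) {
        requireSameLength(x.length, y.length);
        for (int i = 0; i < x.length; i++) {
            y[i] += alpha * x[i];
        }
    }

    /** y += alpha * x for double-precision data. */
    static void axpyDouble(double alpha, double[] x, double[] y) {
        requireSameLength(x.length, y.length);
        for (int i = 0; i < x.length; i++) {
            y[i] += alpha * x[i];
        }
    }

    /**
     * Chooses the routine from the runtime type of both buffers.
     * The scalar g is narrowed to float only when the data is float,
     * and mixed float/double operands are rejected up front.
     */
    static void axpyDispatch(double g, Object x, Object y) {
        if (x instanceof float[] && y instanceof float[]) {
            axpyFloat((float) g, (float[]) x, (float[]) y);
        } else if (x instanceof double[] && y instanceof double[]) {
            axpyDouble(g, (double[]) x, (double[]) y);
        } else {
            throw new IllegalStateException("Operands do not share one data type");
        }
    }

    private static void requireSameLength(int a, int b) {
        if (a != b) {
            throw new IllegalArgumentException("Length mismatch: " + a + " vs " + b);
        }
    }

    public static void main(String[] args) {
        float[] x = {1f, 2f};
        float[] y = {10f, 20f};
        axpyDispatch(0.5, x, y); // takes the float path
        System.out.println(Arrays.toString(y));
    }
}
```

Failing fast on mixed operands mirrors the spirit of the IllegalStateException in the stack trace above: the mismatch is caught at the call site instead of producing silently wrong values.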

lock bot commented Jan 22, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
