XLNET Base for Malay and Indonesian languages (not an issue) #160
Comments
Thanks for sharing it (and the parameters you used for pre-training) 👍 Could you say something about the loss you achieved after 700K steps?

My loss hovered around 2.XX after 700k steps; it never got into 1.XX during the entire training session.
Hi, can you tell us what hardware you used for training, and how long it took for the base and small models?

A single Tesla V100 with 16GB of VRAM; 700k steps took around 4 days for the base model and 2 days for the small models.
Thank you for your response. How many sentences do you have?

Is there one for English as well?
@3NFBAGDU, around 600k plus, total size 1.21GB of pure text. @abhi060698 I believe you can download it from the original repository?
@huseinzol05 From what I saw there's only the Large model. Do you have a link to the Base model? |
Nopeeeee :( |
Hi, I trained an XLNET model 1 week ago. Now I want to train a base model with 3,000,000 sentences, and I want to change my train_gpu parameters as follows: I think the number of iterations in the second example is too few; what do you think, which parameter should I choose to make a base model? I want to make sentence embeddings; I have used sent2vec for this with good results, but I want to improve it with XLNET.
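As a rough way to reason about the step budget (a hedged back-of-envelope sketch: the batch size of 32 is an assumption, and XLNet actually counts steps over fixed-length token chunks rather than whole sentences, so this is only an approximation):

```python
def approx_epochs(train_steps, batch_size, num_examples):
    """Roughly how many passes over the data a given step budget buys."""
    return train_steps * batch_size / num_examples

# 3,000,000 sentences with a hypothetical train_batch_size of 32:
epochs = approx_epochs(700_000, 32, 3_000_000)  # ~7.5 passes over the corpus
```

If the resulting number of passes is very small (well under 1), the iteration count is almost certainly too few for pretraining from scratch.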
You can increase |
Hello, do you know how I can get a sentence-embedding vector for a sentence?
Actually, if you read run_classifier.py, you can get the answer:

```python
# Placeholders for token ids, segment ids, and the input mask.
X = tf.placeholder(tf.int32, [None, None])
segment_ids = tf.placeholder(tf.int32, [None, None])
input_masks = tf.placeholder(tf.float32, [None, None])

# XLNet expects time-major inputs, hence the transposes.
xlnet_model = xlnet.XLNetModel(
    xlnet_config=xlnet_config,
    run_config=xlnet_parameters,
    input_ids=tf.transpose(X, [1, 0]),
    seg_ids=tf.transpose(segment_ids, [1, 0]),
    input_mask=tf.transpose(input_masks, [1, 0]))

summary = xlnet_model.get_pooled_out("last", True)  # your vector for 'Hello, how are you'
```
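For intuition, `get_pooled_out("last", True)` summarizes the sequence by taking the final token's hidden state and passing it through a projection with a tanh activation. A minimal pure-Python sketch of that idea (the toy states and identity weights below are made up for illustration; the real model uses its trained projection):

```python
import math

def pooled_last(hidden_states, weights, bias):
    """Pool a sequence by projecting the last token's hidden state.

    hidden_states: list of per-token vectors (time-major, like XLNet).
    weights: projection matrix (one row per output dim); bias: projection bias.
    """
    last = hidden_states[-1]  # summary_type="last" -> final time step
    projected = [
        sum(w * h for w, h in zip(row, last)) + b
        for row, b in zip(weights, bias)
    ]
    return [math.tanh(v) for v in projected]  # tanh, as with use_summ_proj=True

# Toy example: 3 tokens, hidden size 2, projecting to 2 dims.
states = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
W = [[1.0, 0.0], [0.0, 1.0]]  # identity projection, purely illustrative
b = [0.0, 0.0]
vec = pooled_last(states, W, b)
```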
@huseinzol05 hi, I have a question: every time I run it, I get a different vector for the same sentence. It drives me crazy.
@huseinzol05, when you pretrained your model, did you see an error like this: #168? Thank you.
@luv4me, lol, obviously: we have finite space, so our neural network will cut off some floating points. So, give it a break.
@vanh17, sounds like your data is an empty array. Did you check that your data is not empty?
@huseinzol05 how could we really check for that? I ran data_utils.py on the txt file where each sentence is on one line, with an empty line at the end of each document before the next document begins.
@huseinzol05 I have the exact same problem as @luv4me: each time I'm getting a drastically different vector for the same sentence; the values are not even remotely similar. Is it really the fault of floating points? It seems to me like something is either inconsistent there or the model is training (even if I set …). EDIT: OK, I found out that I can get the same vector consistently if I set the random seed of TensorFlow.
You should set is_training to False, or else the dropout layer randomly applies zero masking.
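To illustrate why is_training matters (a hedged pure-Python sketch of standard inverted dropout, not XLNet's actual implementation): with is_training=True the dropout layer zeroes a random subset of activations on every call, so repeated runs on the same sentence produce different vectors, while with is_training=False the layer is a deterministic pass-through.

```python
import random

def dropout(values, rate, is_training, rng=random):
    """Inverted dropout: randomly zero activations during training only."""
    if not is_training:
        return list(values)  # inference: deterministic identity
    keep = 1.0 - rate
    # Kept activations are scaled by 1/keep so the expected value is unchanged.
    return [v / keep if rng.random() < keep else 0.0 for v in values]

acts = [0.5, 1.0, 1.5, 2.0]
# Training mode: two calls generally disagree because the mask is random.
a = dropout(acts, rate=0.5, is_training=True)
b = dropout(acts, rate=0.5, is_training=True)
# Inference mode: always identical to the input.
assert dropout(acts, 0.5, False) == acts
```

Fixing the random seed makes the training-mode mask reproducible too, which is why setting TensorFlow's seed (as in the comment above) also yields a consistent vector.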
Even after you set is_training to False, is the difference still quite big?
@huseinzol05 yes, I set both is_training and is_finetune to False. I am still getting random outputs.
How many nodes are in the layers, if you know, and is it fully connected?
Hi! This is not an issue, I just want to say XLNET is really great, and I successfully pretrained XLNET from scratch for the Malay and Indonesian languages. You can read the comparison and download the pretrained models here: https://github.com/huseinzol05/Malaya/tree/master/xlnet
I am planning to release XLNET Large for these languages!