how to train a model using the whole corpus (ignoring partitions) #34

Closed
cayaluke opened this issue Oct 26, 2021 · 2 comments
@cayaluke

Dear Silvia,

Love your work!

I have a silly question. How do you train a model using the entire corpus (ignoring partitions)?

Thank you for your help.

Luke

@silviatti
Collaborator

Hello!
Thank you for using OCTIS :)

Not a silly question at all, because it made me realize there's something to fix in the code and in the documentation :)
Every model has a method called "partitioning". If you pass False, the whole dataset will be used for training; otherwise the dataset will be split into train and test sets (the default behavior). You must call this method before training or optimization. For example:

from octis.models.ETM import ETM

model = ETM(num_topics=25)
model.partitioning(False)  # train on the whole corpus, no train/test split
model_output = model.train_model(dataset)
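
Here "dataset" is an OCTIS Dataset object. As a minimal sketch of loading one of the preprocessed corpora, assuming the fetch_dataset helper from the OCTIS docs:

from octis.dataset.dataset import Dataset

dataset = Dataset()
dataset.fetch_dataset("20NewsGroup")  # downloads a preprocessed benchmark corpus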

However, some models also have a parameter "use_partitions" in the constructor, which can be set to False to get the same effect. This is probably the easiest way to do it, but not every model has this parameter (it seems all of them do except LSI and LDA). I'm going to fix this for the next release. See the example below:

model = ETM(num_topics=25, use_partitions=False)  # no call to partitioning() needed
model_output = model.train_model(dataset)
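
For LSI and LDA, which are missing "use_partitions", calling "partitioning" is the fallback. A minimal sketch, assuming LDA exposes the same interface as the other models:

from octis.models.LDA import LDA

model = LDA(num_topics=25)
model.partitioning(False)  # LDA has no use_partitions argument yet, so disable partitioning explicitly
model_output = model.train_model(dataset)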

Thank you for your patience.

Silvia

@cayaluke
Author

Thank you for your response.

It worked.

Grazie mille! (Thanks a million!)
