Releases · andrewtavis/kwx · GitHub

28 Dec 15:01

andrewtavis

kwx 1.0.0 Latest

This release switches kwx over to semantic versioning and indicates that the package is stable

29 Apr 11:59

andrewtavis

kwx 0.1.8

Changes include:

Support has been added for gensim 3.8.x and 4.x
Dependencies in the requirements and environment files are now condensed
An alert was added for users when the corpus size is too small for the number of topics
An import error for pyLDAvis was fixed
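
The corpus-size alert listed above could look roughly like the following. This is a minimal sketch and not kwx's actual implementation: the function name `warn_if_corpus_too_small` and the rule of comparing the document count with the topic count are assumptions made purely for illustration.

```python
import warnings


def warn_if_corpus_too_small(text_corpus, num_topics):
    """Hypothetical helper: warn when there are fewer documents than topics.

    Returns True if a warning was issued, False otherwise.
    """
    num_docs = len(text_corpus)
    if num_docs < num_topics:
        # Assumed rule: a topic model needs at least one document per topic.
        warnings.warn(
            f"The corpus has {num_docs} document(s), which is fewer than "
            f"the {num_topics} topic(s) requested.",
            UserWarning,
        )
        return True
    return False
```

Issuing a `UserWarning` instead of raising an exception lets the run continue while still telling the user that the results may be unreliable.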

30 Mar 06:34

andrewtavis

kwx 0.1.7.3

Changes include:

Switching over to an src structure
Removing the lda_bert method because its dependencies were causing breakages
Code quality is now checked with Codacy
Extensive code formatting to improve quality and style
Bug fixes and a more explicit use of exceptions
More extensive contributing guidelines
Tests now use random seeds and are thus more robust
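
As an illustration of why seeding makes tests more robust, here is a minimal sketch using only the standard library. The helper `sample_documents` and the test below are hypothetical and are not taken from the kwx test suite.

```python
import random


def sample_documents(documents, k, seed=42):
    """Hypothetical helper: draw k documents reproducibly.

    A local random.Random instance is seeded so that the global
    random state is left untouched.
    """
    rng = random.Random(seed)
    return rng.sample(documents, k)


def test_sampling_is_reproducible():
    docs = [f"doc {i}" for i in range(20)]
    # The same seed yields the same sample, so the assertion is deterministic.
    assert sample_documents(docs, 5) == sample_documents(docs, 5)
```

With a fixed seed, a test that fails once fails every time, which makes the failure reproducible and easier to debug.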

15 Mar 13:49

andrewtavis

kwx 0.1.5

Changes include:

Keyword extraction and selection are now separate steps, so modeling doesn't have to be rerun to get new keywords
Keyword extraction and cleaning are now fully separate processes
kwargs for sentence-transformers BERT, LDA, and TFIDF can now be passed
The cleaning process is verbose and uses multiprocessing
The user has greater control over the cleaning process
Reformatting of the code to make the process clearer

17 Feb 21:24

andrewtavis

kwx 0.1.0

First stable release of kwx

Changes include:

Full documentation of the package
Virtual environment files
Bug fixes
Extensive testing of all modules with GH Actions and Codecov
Code of conduct and contribution guidelines

31 Jan 13:25

andrewtavis

kwx 0.0.2.2

The minimum viable product of kwx:

Users are able to extract keywords using the following methods:
- Most frequent words
- TFIDF words unique to one corpus when compared to others
- Latent Dirichlet Allocation
- Bidirectional Encoder Representations from Transformers
- An autoencoder application of LDA and BERT combined
Users are able to tell the model to remove certain words to fine-tune results
Support is offered for a universal cleaning process in all major languages
Visualization techniques to display keywords and topics are included
Outputs can be cleanly organized in a directory or zip file
Runtimes for topic number comparisons are estimated using tqdm
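
To make the simplest method in the list above concrete, the following is a minimal sketch of frequency-based keyword extraction with an option to ignore certain words. It does not use or reproduce the kwx API: the function name `most_frequent_words` and its parameters are hypothetical, and the corpus is assumed to be already cleaned and tokenized.

```python
from collections import Counter


def most_frequent_words(text_corpus, num_keywords=10, ignore_words=None):
    """Hypothetical helper: return the most frequent tokens in a corpus.

    text_corpus is assumed to be a list of token lists (one list per
    document); tokens in ignore_words are skipped.
    """
    ignore = set(ignore_words or [])
    counts = Counter(
        word for doc in text_corpus for word in doc if word not in ignore
    )
    return [word for word, _ in counts.most_common(num_keywords)]


corpus = [["cat", "dog", "cat"], ["dog", "cat", "bird"]]
print(most_frequent_words(corpus, num_keywords=2))
print(most_frequent_words(corpus, num_keywords=2, ignore_words=["cat"]))
```

Passing words to ignore and extracting again mirrors the fine-tuning step described above: unhelpful keywords are removed and the next candidates move up.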
