NLPSamples

Andy Whitton, a partner in Deloitte’s data practice, says:

“Full data classification can be a very expensive activity that very few organisations do well. Certified database technologies can tag every data item but, in our experience, only governments do this because of the cost implications.”

Supervised Document Classification: In supervised classification, an external mechanism (such as human feedback) provides correct information on the classification of documents.
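As a rough illustration of the supervised setting, the sketch below trains a tiny multinomial Naive Bayes classifier on a handful of hand-labelled documents. The choice of Naive Bayes, the function names (`tokenize`, `train_naive_bayes`, `classify`), and the sample texts and labels are all assumptions made for illustration; they are not taken from the notebooks in this repository.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Naive whitespace tokenizer (assumption: lowercase, no punctuation handling).
    return text.lower().split()

def train_naive_bayes(labeled_docs):
    """labeled_docs: list of (text, label) pairs, e.g. supplied by a human annotator."""
    class_counts = Counter()            # number of documents per label
    word_counts = defaultdict(Counter)  # word frequencies per label
    vocabulary = set()
    for text, label in labeled_docs:
        class_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary

def classify(text, model):
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        # Log prior plus log likelihood with add-one (Laplace) smoothing.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # ignore words never seen during training
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical labelled training data (the "external mechanism").
training = [
    ("invoice payment due amount", "finance"),
    ("quarterly revenue payment report", "finance"),
    ("patient diagnosis treatment plan", "medical"),
    ("clinical treatment patient record", "medical"),
]
model = train_naive_bayes(training)
predicted = classify("payment amount overdue", model)
print(predicted)
```

The labels are what make this supervised: without the second element of each training pair, the model would have nothing to learn the classes from.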

Unsupervised Document Classification: Unsupervised document classification, also called document clustering, must be done entirely without reference to external information. Document clustering involves the use of descriptors and descriptor extraction. Descriptors are sets of words that describe the contents of a cluster. Document clustering is generally considered to be a centralized process. Examples of document clustering include web document clustering for search users.
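Descriptor extraction can be sketched as follows, assuming the documents have already been grouped into clusters. Each word is scored by its frequency in a cluster, weighted down when it also appears in other clusters (a TF-IDF-style heuristic chosen here for illustration, not a method stated in this README). The function name `extract_descriptors` and the cluster names are hypothetical.

```python
import math
from collections import Counter

def extract_descriptors(clusters, top_n=2):
    """clusters: dict mapping a cluster name to a list of document strings.
    Returns the top_n highest-scoring words for each cluster."""
    # Word frequencies aggregated over all documents in each cluster.
    cluster_tokens = {
        name: Counter(word for doc in docs for word in doc.lower().split())
        for name, docs in clusters.items()
    }
    n_clusters = len(clusters)
    # In how many clusters does each word occur?
    cluster_freq = Counter()
    for counts in cluster_tokens.values():
        cluster_freq.update(counts.keys())
    descriptors = {}
    for name, counts in cluster_tokens.items():
        # Words shared by every cluster get weight log(1) = 0.
        scores = {w: c * math.log(n_clusters / cluster_freq[w]) for w, c in counts.items()}
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        descriptors[name] = ranked[:top_n]
    return descriptors

# Hypothetical clusters, as if produced by an earlier clustering step.
clusters = {
    "animals": ["the cat chased the mouse", "the cat slept"],
    "finance": ["the stock market fell", "the market rallied"],
}
descriptors = extract_descriptors(clusters)
print(descriptors)
```

Common words such as "the" occur in every cluster, so they score zero and do not crowd out the words that actually distinguish one cluster from another.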

In general, there are two common families of clustering algorithms.

(i) The first is the hierarchical family, which includes single linkage, complete linkage, group average, and Ward’s method. By aggregating or dividing, documents can be clustered into a hierarchical structure, which is suitable for browsing. However, such algorithms usually suffer from efficiency problems.

(ii) The second is based on the K-means algorithm and its variants. Generally, hierarchical algorithms produce more in-depth information for detailed analyses, while K-means-based algorithms are more efficient and provide sufficient information for most purposes. These algorithms can further be classified as hard clustering (each document belongs to exactly one cluster) or soft clustering (each document has a degree of membership in several clusters).
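The two families can be sketched side by side in plain Python. This is a minimal illustration under several assumptions: documents are represented as bag-of-words counts with cosine distance for the hierarchical part, K-means runs on hypothetical two-dimensional document vectors with the first `k` points as initial centroids, and soft membership uses the fuzzy c-means membership formula with fuzzifier 2. Ward's method is omitted, and none of the function names come from this repository.

```python
import math
from collections import Counter

def cosine_distance(a, b):
    """1 - cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return 1.0 - dot / norm if norm else 1.0

# How pairwise distances between two clusters are combined.
LINKAGES = {
    "single": min,                            # closest pair
    "complete": max,                          # farthest pair
    "average": lambda ds: sum(ds) / len(ds),  # group average
}

def agglomerative(docs, n_clusters, linkage="average"):
    """Bottom-up clustering: merge the two closest clusters until n_clusters remain."""
    vectors = [Counter(d.lower().split()) for d in docs]
    clusters = [[i] for i in range(len(docs))]
    combine = LINKAGES[linkage]
    while len(clusters) > n_clusters:
        best = None
        # Every merge rescans all cluster pairs, which is why this scales poorly.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                dists = [cosine_distance(vectors[a], vectors[b])
                         for a in clusters[i] for b in clusters[j]]
                d = combine(dists)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centroids):
    """Hard assignment: index of the nearest centroid for each point."""
    return [min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
            for p in points]

def kmeans(points, k, iterations=10):
    centroids = [list(p) for p in points[:k]]  # assumption: deterministic initialisation
    for _ in range(iterations):
        assignments = assign(points, centroids)
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign(points, centroids)

def soft_memberships(point, centroids):
    """Soft assignment: fuzzy c-means style memberships (fuzzifier m = 2)."""
    d2 = [squared_distance(point, c) for c in centroids]
    if 0.0 in d2:  # the point coincides with one or more centroids
        zeros = d2.count(0.0)
        return [1.0 / zeros if d == 0.0 else 0.0 for d in d2]
    inverse = [1.0 / d for d in d2]
    total = sum(inverse)
    return [w / total for w in inverse]

# Hypothetical sample data.
docs = ["cat chased mouse", "cat ate mouse", "stock market fell", "stock market rose"]
hierarchy = agglomerative(docs, n_clusters=2, linkage="complete")

points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
centroids, hard = kmeans(points, k=2)
soft = soft_memberships((2.5, 3.0), centroids)
print(hierarchy, hard, soft)
```

The agglomerative loop recomputes every pairwise cluster distance on each merge, which reflects the efficiency concern noted for hierarchical methods, whereas each K-means iteration only compares every point against `k` centroids.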

WEBSHIVOM/NLPSamples

Folders and files

Latest commit

History

Repository files navigation

NLPSamples

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages