Wikipedia analyser

Project for #YRS2013

Uses neural network from jcla1/nn

Analyses Wikipedia lexically and structurally to find content that needs to be improved. It uses a (self-written) neural network to analyse structure and an ngram model to analyse sentence fragments. It is built around a pipeline that lets the work be distributed across multiple CPU cores.
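
To illustrate the pipeline idea, here is a minimal sketch of fanning work out to one goroutine per CPU core and collecting the results. It is not the code from pipeline.go: the `Page` and `Result` types, the `analyse` stage (a plain word count) and `runPipeline` are all hypothetical names made up for this example.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
	"sync"
)

// Page is a hypothetical stand-in for a parsed Wikipedia page.
type Page struct {
	Title string
	Text  string
}

// Result is a hypothetical per-page output of one pipeline stage.
type Result struct {
	Title string
	Words int
}

// analyse is a placeholder stage; it only counts whitespace-separated words.
func analyse(p Page) Result {
	return Result{Title: p.Title, Words: len(strings.Fields(p.Text))}
}

// runPipeline fans pages out to the given number of worker goroutines
// and gathers their results. Result order is not guaranteed.
func runPipeline(pages []Page, workers int) []Result {
	in := make(chan Page)
	out := make(chan Result)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range in {
				out <- analyse(p)
			}
		}()
	}

	// Feed the workers, then signal that no more pages are coming.
	go func() {
		for _, p := range pages {
			in <- p
		}
		close(in)
	}()

	// Close the output channel once every worker has finished.
	go func() {
		wg.Wait()
		close(out)
	}()

	var results []Result
	for r := range out {
		results = append(results, r)
	}
	return results
}

func main() {
	pages := []Page{
		{Title: "Go", Text: "Go is a programming language"},
		{Title: "Stub", Text: "Short article"},
	}
	for _, r := range runPipeline(pages, runtime.NumCPU()) {
		fmt.Printf("%s: %d words\n", r.Title, r.Words)
	}
}
```

Unbuffered channels keep memory use flat here, because the producer only reads the next page when a worker is free to take it.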

To get started, look at the different parts the project is divided into in parts.go. I haven't tested the ngram code yet, since I haven't got enough disk space to process the ngrams.
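
Since the ngram code is untested, a small self-contained sketch of what an ngram model does may help when reading it. The example below is an assumption-laden illustration, not the code in the ngram directory: it uses bigrams only, trains on in-memory sentences rather than a large ngram dataset, and the `BigramModel` type with its `Train` and `Prob` methods is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// BigramModel is a hypothetical, minimal ngram model with n = 2.
type BigramModel struct {
	context map[string]int    // how often a word is followed by any word
	bigrams map[[2]string]int // how often a specific word pair occurs
}

// NewBigramModel returns an empty model.
func NewBigramModel() *BigramModel {
	return &BigramModel{
		context: make(map[string]int),
		bigrams: make(map[[2]string]int),
	}
}

// Train adds the adjacent word pairs of one sentence to the counts.
func (m *BigramModel) Train(sentence string) {
	words := strings.Fields(strings.ToLower(sentence))
	for i := 0; i+1 < len(words); i++ {
		m.context[words[i]]++
		m.bigrams[[2]string{words[i], words[i+1]}]++
	}
}

// Prob returns the maximum-likelihood estimate of P(next | prev).
// It returns 0 when prev was never seen with a following word.
func (m *BigramModel) Prob(prev, next string) float64 {
	total := m.context[prev]
	if total == 0 {
		return 0
	}
	return float64(m.bigrams[[2]string{prev, next}]) / float64(total)
}

func main() {
	m := NewBigramModel()
	m.Train("the cat sat on the mat")
	m.Train("the cat ran away")

	// A low probability for a word pair hints at an unusual sentence fragment.
	fmt.Println(m.Prob("the", "cat"))
	fmt.Println(m.Prob("cat", "flew"))
}
```

A real model would need smoothing for unseen pairs and far more training data, which is where the disk space requirement comes from.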

jcla1/wikipedia_analyser
