NodeJournal reads a site and summarizes it like a newspaper.

In concrete terms, it first reads the site's text contents, then classifies them into titles and details with a model.
(neuraln needs a C compiler when installing.)
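As an illustration of the classification step, each block of site text could be mapped to a small numeric feature vector before being fed to the model. The five features below (length, word count, and so on) are assumptions made for this sketch, chosen only to match the five-input layer in `model.json`; NodeJournal's actual features may differ.

```javascript
// Hypothetical feature extraction for one block of site text.
// The five features are an assumption for illustration, not
// necessarily what NodeJournal computes.
function toFeatures(text) {
  const trimmed = text.trim();
  const words = trimmed.split(/\s+/).filter(Boolean);
  return [
    Math.min(trimmed.length / 100, 1),            // normalized character count
    Math.min(words.length / 20, 1),               // normalized word count
    /[.!?]$/.test(trimmed) ? 1 : 0,               // ends like a sentence (detail-ish)
    trimmed === trimmed.toUpperCase() ? 1 : 0,    // all caps (title-ish)
    (trimmed.match(/\d/g) || []).length /
      Math.max(trimmed.length, 1)                 // digit ratio
  ];
}

// One short headline-like block becomes a 5-element vector.
const features = toFeatures("Breaking News");
```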
You can train the model as follows.
- Download the site contents. You will get `dataset.txt`.

  ```
  node training/GetDataset.js (site-url)
  ```
- Copy `dataset.txt` to `training.txt` and add labels for supervised learning.
- Prepare `model.json` to define the model (layer architecture).

  ```json
  {
    "layers": [5, 10, 10, 3]
  }
  ```
- Train the model. The training result is saved to `modelMemory.txt`.

  ```
  node training/Training.js
  ```

  If you already have `modelMemory.txt`, it will be loaded before learning.
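The `layers` array in `model.json` describes the network shape end to end: 5 input features, two hidden layers of 10 neurons each, and 3 output classes. A minimal feed-forward pass over such an architecture can be sketched in plain JavaScript (this is not neuraln's API, and it uses random weights rather than the trained `modelMemory.txt`):

```javascript
// Sketch of a forward pass for the layer shape [5, 10, 10, 3].
// Weights are random here; in NodeJournal they would come from training.
const sigmoid = (x) => 1 / (1 + Math.exp(-x));

function randomLayer(inputs, outputs) {
  // One neuron = a weight per input plus a bias.
  return Array.from({ length: outputs }, () => ({
    weights: Array.from({ length: inputs }, () => Math.random() * 2 - 1),
    bias: Math.random() * 2 - 1
  }));
}

function forward(layers, input) {
  // Each layer maps the previous activations to new ones.
  return layers.reduce(
    (acts, layer) =>
      layer.map((n) =>
        sigmoid(n.weights.reduce((sum, w, i) => sum + w * acts[i], n.bias))
      ),
    input
  );
}

const shape = [5, 10, 10, 3]; // as in model.json
const net = shape.slice(1).map((size, i) => randomLayer(shape[i], size));

// One 5-element feature vector in, three class scores out.
// (Interpreting the 3 outputs as title / detail / other is an assumption.)
const output = forward(net, [0.1, 0.5, 0, 1, 0.2]);
```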
Then, you can read the site. The result is saved to `crawled.txt`.

```
node Crawl.js (site-url)
```