auremoser/mozfest-ether

mozfest-ether

etherpad stuff for mozfest 2016

Sources

Etherpads were created for each Mozfest 2016 session, accessible from the website via the notes button. Sessions were organized into 11 spaces.

Matthew made a spreadsheet of all etherpads with their titles and topics and some other stats.

Purpose

The goal is a reusable process for mining the etherpads. This repo is a first test of that.

Requirements

  • extract the text from all the etherpads, possibly by downloading them all, as we did once before for my workweek in London
  • get some stats on what was actually discussed in the etherpads
    • how many etherpads are empty?
    • how much text was written in each of them (character count)?
    • how many times did certain keywords/phrases appear (like "open", "libre", "innovation", "inclusion", "privacy", "science")?
    • which "space" was the most text-heavy, i.e. which space took the most notes?
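
The stats listed above could be computed along these lines. This is a minimal sketch, not the code in index.js: the pad record shape (`{ space, text }`) and all function names are assumptions made for illustration, and keywords are matched as whole words, case-insensitively, which is one possible reading of "appear".

```javascript
// Hypothetical input: an array of pad records shaped like { space: string, text: string }.
const KEYWORDS = ['open', 'libre', 'innovation', 'inclusion', 'privacy', 'science'];

// Escape regex metacharacters so a keyword or phrase is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Count case-insensitive, whole-word occurrences of each keyword in one text.
function countKeywords(text, keywords = KEYWORDS) {
  const counts = {};
  for (const kw of keywords) {
    const matches = text.match(new RegExp(`\\b${escapeRegExp(kw)}\\b`, 'gi'));
    counts[kw] = matches ? matches.length : 0;
  }
  return counts;
}

// Per-pad stats: character count, emptiness, keyword counts.
function padStats(pad) {
  const text = pad.text.trim();
  return {
    space: pad.space,
    characters: text.length,
    empty: text.length === 0,
    keywords: countKeywords(text),
  };
}

// Aggregate stats across all pads: empty count, characters per space, keyword totals.
function summarize(pads) {
  const summary = { total: pads.length, empty: 0, charactersBySpace: {}, keywords: {} };
  for (const pad of pads) {
    const stats = padStats(pad);
    if (stats.empty) summary.empty += 1;
    summary.charactersBySpace[stats.space] =
      (summary.charactersBySpace[stats.space] || 0) + stats.characters;
    for (const [kw, n] of Object.entries(stats.keywords)) {
      summary.keywords[kw] = (summary.keywords[kw] || 0) + n;
    }
  }
  return summary;
}
```

The space with the largest value in `charactersBySpace` would be the most text-heavy one. Whole-word matching means "open" does not count "openness"; dropping the `\b` anchors would count substrings instead.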

Output

index.js script creates:

  • 1 json file per etherpad with text contents
  • 1 stats.json file with brief counts and general things listed in the requirements

index-empty.js script creates:

  • 1 json file per etherpad with text contents
  • 1 stats.json file with brief counts and general things listed in the requirements
  • unlike index.js, this script excludes default etherpad text from the counts, so that empty or "default text" etherpads are not counted in the stats
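
Excluding "default text" pads could work roughly as follows. This is a sketch under assumptions, not the code in index-empty.js: `DEFAULT_PAD_TEXT` is a placeholder (the real default text depends on how the Etherpad instance was configured), and the function names and the `{ text }` record shape are hypothetical.

```javascript
// Placeholder only: substitute the actual default text of the Etherpad instance.
const DEFAULT_PAD_TEXT = 'Welcome to Etherpad!';

// Collapse runs of whitespace so trailing newlines or re-wrapped lines
// do not defeat the comparison.
function normalize(text) {
  return text.replace(/\s+/g, ' ').trim();
}

// A pad is "untouched" if it is blank or contains nothing but the default text.
function isUntouched(text, defaultText = DEFAULT_PAD_TEXT) {
  const normalized = normalize(text);
  return normalized === '' || normalized === normalize(defaultText);
}

// Keep only pads with real content before computing stats.
function dropUntouched(pads, defaultText = DEFAULT_PAD_TEXT) {
  return pads.filter((pad) => !isUntouched(pad.text, defaultText));
}
```

An exact comparison like this misses pads where someone added a single word to the default text; stripping the default text out of each pad before counting would be a stricter alternative.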

To run:

  1. Convert Matthew's CSV from the spreadsheet into JSON format using a CSV-to-JSON conversion tool.
  2. npm install request request-promise (request is a peer dependency of request-promise)
  3. node index.js. The script collects each etherpad's URL and downloads the content, creating one file per etherpad.
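
The download step could be sketched like this. It is an illustration, not the contents of index.js. It assumes the pads are served by a standard Etherpad instance, where appending `/export/txt` to a pad URL returns its plain text, and that the converted JSON has been reduced to an array of pad URLs. The helper names and the `{ url, text }` output shape are hypothetical. The fetch function is passed in so the sketch runs without network access.

```javascript
const fs = require('fs');
const path = require('path');

// Build the plain-text export URL for a pad (assumes a standard Etherpad route).
function exportUrl(padUrl) {
  return padUrl.replace(/\/+$/, '') + '/export/txt';
}

// Derive a safe file name from the last path segment of the pad URL.
function fileNameFor(padUrl) {
  const name = padUrl.replace(/\/+$/, '').split('/').pop();
  return name.replace(/[^A-Za-z0-9_-]/g, '_') + '.json';
}

// Download each pad in turn and write one JSON file per pad into outDir.
// fetchText is any function (url) => Promise<string>.
async function downloadPads(padUrls, fetchText, outDir) {
  fs.mkdirSync(outDir, { recursive: true });
  const written = [];
  for (const url of padUrls) {
    const text = await fetchText(exportUrl(url));
    const file = path.join(outDir, fileNameFor(url));
    fs.writeFileSync(file, JSON.stringify({ url, text }, null, 2));
    written.push(file);
  }
  return written;
}
```

With request-promise installed, the fetch function can be the library itself, since calling it with a URL returns a promise for the response body: `downloadPads(urls, require('request-promise'), 'pads')`. Downloading sequentially keeps the load on the Etherpad server low, at the cost of speed.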

Thanks

Emi for the tips in setting this up!
