This repository was archived by the owner on Sep 27, 2022. It is now read-only.

CoEDL/olac-visualisation-v1


This application is no longer being actively developed. A new version can be found at https://github.com/CoEDL/olac-visualisation-v2

OLAC Visualisation

About this project

This work is led by Nick Thieberger at the University of Melbourne as part of the Centre of Excellence for the Dynamics of Language (ARC grant CE140100041).

Preamble

This code is in two parts: an Angular web app and a Python scraper that produces the data the app displays.

Hacking on the angular app

After you've cloned the repo:

- cd app
- npm install
- bower install
- grunt serve

The app will be available at http://{YOUR HOST}:9000

Hacking on the python scraper

Ensure you have Python 2.7 and lxml installed and you should be good to go.

There are two modes of invocation: the first scrapes the language pages and produces a JSON representation of each; the second produces the summary pages. Run them in that order:

./process-language-pages.py --languages languages.csv --pages http://www.language-archives.org/language --output $OUTPUT --info
./create-summary.py --languages $OUTPUT --info

Here's the help:

./process-language-pages.py --help
usage: process-language-pages.py [-h] --languages LANGUAGES --pages PAGES
                                 --output OUTPUT [--one] [--refresh] [--info]
                                 [--debug]

Process OLAC Language Pages

optional arguments:
  -h, --help            show this help message and exit
  --languages LANGUAGES
                        The path to the CSV file containing the language codes
  --pages PAGES         The base pages URL: Probably: http://www.language-
                        archives.org/language
  --output OUTPUT       The folder to store the JSON representation.
  --one                 Process only one language code.
  --refresh             Ignore data and reprocess.
  --info                Turn on informational messages
  --debug               Turn on full debugging (includes --info)
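
For readers who want to see how such an interface is typically declared, here is a minimal sketch of an argparse parser that would accept the same options as the help text above. It is a reconstruction from the help output only, not the script's actual source; the function name build_parser is hypothetical.

```python
import argparse


def build_parser():
    """Declare the options listed in the --help output (reconstruction, not the original source)."""
    parser = argparse.ArgumentParser(
        prog="process-language-pages.py",
        description="Process OLAC Language Pages",
    )
    # The three required options each take a value.
    parser.add_argument("--languages", required=True,
                        help="The path to the CSV file containing the language codes")
    parser.add_argument("--pages", required=True,
                        help="The base pages URL: Probably: http://www.language-archives.org/language")
    parser.add_argument("--output", required=True,
                        help="The folder to store the JSON representation.")
    # The remaining options are plain on/off switches.
    parser.add_argument("--one", action="store_true",
                        help="Process only one language code.")
    parser.add_argument("--refresh", action="store_true",
                        help="Ignore data and reprocess.")
    parser.add_argument("--info", action="store_true",
                        help="Turn on informational messages")
    parser.add_argument("--debug", action="store_true",
                        help="Turn on full debugging (includes --info)")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args([
    "--languages", "languages.csv",
    "--pages", "http://www.language-archives.org/language",
    "--output", "data",
    "--info",
])
print(args.languages, args.output, args.info)
```

Switches that are not passed default to False, so a run without --refresh would, under this reading of the help text, reuse data that already exists.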

Installing the app on your server

Assuming you have a Linux server with the prerequisites installed and ready (Python 2.7, lxml and a web server), installation consists of downloading the scraper scripts into a suitable location and cloning the web app into a folder that the web server is configured to serve.

To get the app, clone the testing branch (assuming you are already in the folder where you want the code):

git clone -b testing git@github.com:MLR-au/olac-visualisation.git .

To get the scraper scripts, clone master (again, assuming you are in the folder where you want to install the code):

git clone git@github.com:MLR-au/olac-visualisation.git

Then you can set up cron jobs to scrape the language archives site nightly or weekly, as desired. See the section "Hacking on the python scraper" for an example invocation. Note that you must set $OUTPUT to the folder named data inside the web app folder.
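
One way to wire the two invocations into a scheduled job is a small wrapper that runs them in order and stops if the scrape fails. The following is a sketch only: the function names build_commands and run_scrape, the default languages.csv location and the example data path are assumptions, not part of this repository.

```python
import subprocess


def build_commands(output_dir, languages_csv="languages.csv",
                   pages_url="http://www.language-archives.org/language"):
    """Return the two scraper invocations in the order they must run."""
    scrape = ["./process-language-pages.py",
              "--languages", languages_csv,
              "--pages", pages_url,
              "--output", output_dir,
              "--info"]
    summarise = ["./create-summary.py", "--languages", output_dir, "--info"]
    return [scrape, summarise]


def run_scrape(output_dir, scripts_dir="."):
    """Run both steps from scripts_dir; a failing step raises CalledProcessError."""
    for command in build_commands(output_dir):
        # check_call raises on a non-zero exit status, so the summary is
        # never rebuilt from a scrape that did not finish.
        subprocess.check_call(command, cwd=scripts_dir)


# Show the commands for a hypothetical web app checkout; nothing is executed here.
for command in build_commands("/var/www/olac/data"):
    print(" ".join(command))
```

A script that calls run_scrape with the real data folder can then be scheduled from cron at whatever interval suits, for example nightly.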
