pubscout

A web-based publications viewer/sorter for LBNL publications hosted on CDL's Symplectic Elements instance.

A running instance of this can be seen at

https://skunkworks.lbl.gov/scout

How it works

Pubscout gets data from three sources:

* The HR file that LBNL creates and sends to CDL weekly
* A list of LBNL publications curated by DOE OSTI
* CDL's Symplectic Elements installation

A cron job runs a series of three queries of those sources on a daily basis and deposits the raw data into a local folder.
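The "series of queries" idea can be sketched in node.js as a small runner that executes each fetch step in order and stops at the first failure, so a later step never runs against stale or missing data. This is only an illustration: the script names in STEPS are assumed from the repository's file names, their order is a guess, and nothing here claims to be what update.sh actually does.

```javascript
const { spawnSync } = require('child_process');

// Hypothetical step list. The script names are assumed from the repository's
// file names and the order is an assumption, not taken from update.sh.
const STEPS = [
  ['node', ['get_hr.js']],
  ['node', ['get_osti.js']],
  ['node', ['get_elements.js']],
];

// Run each command in order and stop at the first one that does not exit
// with status 0 (this includes commands that fail to start at all).
function runSteps(steps) {
  for (let i = 0; i < steps.length; i++) {
    const [cmd, args] = steps[i];
    const result = spawnSync(cmd, args, { stdio: 'inherit' });
    if (result.status !== 0) {
      return { ok: false, failedAt: i };
    }
  }
  return { ok: true, failedAt: -1 };
}
```

Calling runSteps(STEPS) from the project folder would run the three fetchers in turn. A cron wrapper could use the returned failedAt index to decide whether to log an error.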

Once those queries have run, a node.js web server can be started. It loads those files into memory and serves a website that makes it reasonably easy to slice and dice the data.
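As a rough picture of the "load those files into memory" step, here is a minimal sketch that reads every JSON file in a folder into one object at startup. It assumes the raw data is stored as .json files, which this README does not state, and loadDataFolder is a hypothetical helper name rather than a function from this project.

```javascript
const fs = require('fs');
const path = require('path');

// Read every .json file in `dir` into one in-memory object, keyed by the
// file name without its extension. The JSON format is an assumption.
function loadDataFolder(dir) {
  const data = {};
  for (const name of fs.readdirSync(dir)) {
    if (path.extname(name) !== '.json') continue; // skip anything else
    const text = fs.readFileSync(path.join(dir, name), 'utf8');
    data[path.basename(name, '.json')] = JSON.parse(text);
  }
  return data;
}
```

Because everything is loaded once at startup, a server built this way only sees new data after it is restarted, which fits the daily restart described in the setup steps.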

Installation & Setup

Start with a blank Linux system. I use Ubuntu 18.04.
Install nginx

apt-get install nginx

Configure nginx with SSL if desired, and set up forwarding as follows:

    location /scout/ {
        proxy_set_header     Host $host;
        proxy_set_header     X-Real-IP $remote_addr;
        proxy_set_header     X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header     X-Forwarded-Proto $scheme;
        proxy_read_timeout   90;
        proxy_pass           http://localhost:10001;
        client_max_body_size 32m;
        client_body_timeout  60s;
    }
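Because proxy_pass above has no URI part, nginx forwards the request path unchanged, so the node.js side still sees the /scout prefix. The sketch below shows one way a backend could strip that prefix and read the forwarded headers. The stripPrefix, clientInfo and handler names are hypothetical and are not taken from this project's server code.

```javascript
const http = require('http');

// Base path taken from the nginx `location /scout/` block.
const PREFIX = '/scout';

// Return the path with the prefix removed, or null if the request is
// outside the prefix. The query string, if any, is left in place.
function stripPrefix(url, prefix = PREFIX) {
  if (url === prefix) return '/';
  return url.startsWith(prefix + '/') ? url.slice(prefix.length) : null;
}

// Recover the original client details from the headers set by the proxy
// block. Node lowercases incoming header names.
function clientInfo(headers) {
  return {
    ip: headers['x-real-ip'] || null,
    proto: headers['x-forwarded-proto'] || 'http',
  };
}

// Illustrative handler: echoes the routed path and the client details.
function handler(req, res) {
  const route = stripPrefix(req.url);
  if (route === null) {
    res.writeHead(404, { 'Content-Type': 'text/plain' });
    res.end('not found\n');
    return;
  }
  res.writeHead(200, { 'Content-Type': 'application/json' });
  res.end(JSON.stringify({ route, client: clientInfo(req.headers) }));
}

// Build the server without starting it; the caller decides when to listen.
function createServer() {
  return http.createServer(handler);
}
```

To try it locally, call createServer().listen(10001, 'localhost') so the port matches the proxy_pass target above.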

Install node.js. Use NVM to get a recent stable version. As of this writing, that is a 10.x.y series.

Follow instructions here: https://github.com/nvm-sh/nvm
Run npm install to get the required libraries.
Edit the hrcreds.json and dbcreds.json files to include your credentials for the CDL FTP site and the CDL Elements Reporting Database, respectively.
Obtain recent versions of the .csv files for the osti subfolder. These come from Excel spreadsheets that OSTI mails out on a regular basis.
(optional) You can test database access by running the test.js in the test/ folder.
Adjust and install the pubscout.service file for systemd as necessary.

On Ubuntu systems, service files go in /etc/systemd/system. After copying the file, run systemctl daemon-reload, but do not start the service yet.
Download and prepare all the data by running update.sh
Assuming that worked, start the server by running: systemctl start pubscout
Set up automatic daily download of data using update.sh by adding it as a cron job. Run crontab -e and then add the following line to run daily at 3am:

0 3 * * * /home/dgj/projects/library/scout/update.sh

Set up automatic daily restarting of the pubscout service so that it picks up the fresh data. Unlike the previous step, this should be done as root, so use sudo crontab -e and add this line:

0 4 * * * sudo systemctl restart pubscout

Author

Dave Jacobowitz (github djacobow)
