diff --git a/README.md b/README.md
index b3fb8de6e..130746988 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # Head Start
 
-Head Start is a web-based knowledge mapping software intended to give researchers a head start on their literature review (hence the name). It comes with a powerful backend that is is capable of automatically producing knowledge maps from a variety of data, including text, metadata and references.
+Head Start is a web-based knowledge mapping software intended to give anyone a head start on their literature search (hence the name). It comes with a scalable backend that is capable of automatically producing knowledge maps from a variety of data sources.
 
 ![Head Start](headstart.png)
 
@@ -9,7 +9,7 @@ Head Start is a web-based knowledge mapping software intended to give researcher
 ### Client
 To get started, clone this repository. Next, duplicate the file `config.example.js` in the root folder and rename it to `config.js`.
 
-Make sure to have installed `node` version >= 14.18.1 and `npm` version >=8.1.1 (best way to install is with [nvm](https://github.com/nvm-sh/nvm), `nvm install 14.18.1`) and run the following command to install the Headstart dependencies:
+Make sure to have installed `node` version >= 18.20.0 and `npm` version >=10.7.0 (best way to install is with [nvm](https://github.com/nvm-sh/nvm), `nvm install 18.20.0`) and run the following command to install the Headstart dependencies:
 
     npm install
 
@@ -19,119 +19,47 @@ We use [webpack](https://webpack.github.io/) to build our client-side applicatio
 
 The browser will automatically open a new window with the example.
 
-You can run also different examples
-
-- `npm run example:pubmed` will run the PubMed example
-- `npm run example:triple` will run the GoTriple example
-- `npm run example:viper` will run the Viper example
-- `npm run example:covis` will run the CoVis example
-
-If everything has worked out, you should see the example visualization.
-
-To run Headstart on a different server (e.g. Apache), you need to set the publicPath in `config.js` to the URL of the `dist` directory:
-* Dev: specify the full path including protocol, e.g. `http://localhost/headstart/dist`
-* Production: specify the full path excluding protocol, e.g. `//example.org/headstart/dist`
-
-Then build it with the command `npm run prod`. The build will appear in the _dist/_ folder in the root directory.
-
-You can also set the `skin` property in the config to one of the following values to use the
-particular data integration skin:
-
-- `"covis"`
-- `"triple"`
-- `"viper"`
-
-or leave it empty (`""`) for the default project website skin.
-
-See [client configuration](doc/README.md) for details on adapting the client.
-
- Also see visualization [options](doc/README.md#visualisation-settings).
-
-### Server
-
-See [Installing and configuring the server](doc/server_config.md) for instructions on how to install and configure the server. Also, see [HOWTO: Get the search repos example to work](doc/howto_search_repos.md).
-
-Make sure to have installed `node` version >= 14.18.1 and `npm` version >=8.1.1 (best way to install is with [nvm](https://github.com/nvm-sh/nvm), `nvm install 14.18.1`) and run the following two commands to build the Headstart client:
-
-    npm install
-    npm run dev
-
-We are using [webpack](https://webpack.github.io/) to build our client-side application. `webpack` is started in *watch mode* which means that changes to files are tracked and the created `headstart.js` is automatically updated.
-
-Now you can run a local dev server:
-
-    npm start
-
-Note: you can also set the skin in this step as an argument to the `npm start` command (e.g. `npm start -- --env skin=triple`).
-
-The browser will automatically open a new window with the example specified by the skin.
-
-Alternatively, you can point your browser to one of the following addresses:
-
-    http://localhost:8080/project_website/base.html
-    http://localhost:8080/project_website/pubmed.html
-    http://localhost:8080/local_covis/
-    http://localhost:8080/local_triple/map.html
-    http://localhost:8080/local_triple/stream.html
-    http://localhost:8080/local_viper/
+You can run also run the PubMed example using `npm run example:pubmed`
 
 If everything has worked out, you should see the example visualization.
 
-To run Headstart on a different server (e.g. Apache), you need to set the publicPath in `config.js` to the URL of the `dist` directory:
-* Dev: specify the full path including protocol, e.g. `http://localhost/headstart/dist`
-* Production: specify the full path excluding protocol, e.g. `//example.org/headstart/dist`
-
-
 ## Contributors
 
-Maintainer: [Peter Kraker](https://github.com/pkraker) ([pkraker@openknowledgemaps.org](mailto:pkraker@openknowledgemaps.org))
-
-Authors: [Maxi Schramm](https://github.com/tanteuschi), [Christopher Kittel](https://github.com/chreman), [Jan Konstant](https://github.com/konstiman), [Asura Enkhbayar](https://github.com/Bubblbu), [Scott Chamberlain](https://github.com/sckott), [Rainer Bachleitner](https://github.com/rbachleitner), [Yael Stein](https://github.com/jaels), [Thomas Arrow](https://github.com/tarrow), [Mike Skaug](https://github.com/mikeskaug), [Philipp Weissensteiner](https://github.com/wpp), and the [Open Knowledge Maps team](http://openknowledgemaps.org/team)
-
-
-## Features
-
-* Interactive, web-based knowledge maps based on [D3.js](https://d3js.org), following Shneiderman's principle of "overview first, zoom and filter, then details-on-demand"
-* Synchronized list representation of documents complementing the knowledge map
-* Integrated PDF viewer and annotation tool, courtesy of [Hypothes.is](https://hypothes.is)
-* Powerful server component written in PHP and R for the creation of knowledge maps, including algorithms for clustering, ordination and labelling
-* Connectors to a number of academic search engines through [rOpenSci](https://ropensci.org), including [BASE](https://base-search.net), [PubMed](https://www.ncbi.nlm.nih.gov/pubmed), [PLOS](https://plos.org) and [DOAJ](https://doaj.org)
-* Persistence and versioning system based on SQLite
+Maintainer: [Christopher Kittel](https://github.com/chreman) ([christopher.kittel@openknowledgemaps.org](mailto:christopher.kittel@openknowledgemaps.org)), [Maxi Schramm](https://github.com/tanteuschi) ([maxi@openknowledgemaps.org](mailto:maxi@openknowledgemaps.org)), and [Peter Kraker](https://github.com/pkraker) ([pkraker@openknowledgemaps.org](mailto:pkraker@openknowledgemaps.org))
+Authors: [Thomas Arrow](https://github.com/tarrow), [Andrei Shket](https://github.com/andreishket), [Sergey Krutilin](https://github.com/modsen-hedgehog), [Alexandra Shubenko](https://github.com/vrednyydragon), [Jan Konstant](https://github.com/konstiman), [Asura Enkhbayar](https://github.com/Bubblbu), [Scott Chamberlain](https://github.com/sckott), [Rainer Bachleitner](https://github.com/rbachleitner), [Yael Stein](https://github.com/jaels), [Mike Skaug](https://github.com/mikeskaug), [Philipp Weissensteiner](https://github.com/wpp), and the [Open Knowledge Maps team](http://openknowledgemaps.org/team)
 
 ## Showcases
 
-* [Open Knowledge Maps](https://openknowledgemaps.org/): Creates a visualization on the fly based on a user's search in either BASE or PubMed.
-* [VIPER - The Visual Project Explorer](https://openknowledgemaps.org/viper/): Provides overviews of research projects indexed by OpenAIRE.
-* [CRIS Vis](https://ois.lbg.ac.at/en/cris-I-research-questions): Enables the exploration of crowd-sourced research questions related to mental health.
-* [Overview of Educational Technology](https://openknowledgemaps.org/educational-technology): A working prototype for the field of educational technology based on co-readership.
-* [OpenUP Dissemination Toolbox](https://www.openuphub.eu/tools): A prototype showcasing an overview of innovative dissemination case studies.
-* [Conference Navigator 3](http://halley.exp.sis.pitt.edu/cn3/visualization.php?conferenceID=131) [registration required]: An adaptation of Head Start for the conference scheduling system CN3. This version enables users to schedule papers directly from the visualization. Scheduled papers and recommended papers are highlighted.
+* [Open Knowledge Maps Search](https://openknowledgemaps.org/): Creates a visualisation on the fly based on a user's search in either BASE or PubMed.
+* [OKMaps Custom Services](https://openknowledgemaps.org/custom): Enable third parties to embed customisable search components and visualisations.
+* [VisConnect](https://openknowledgemaps.org/visconnect): Provides an interactive visual profile of a researcher’s work.
 
-## Compatibility
+## Browser compatibility
 
-The visualization has been successfully tested with Chrome, Firefox, Safari and Microsoft Edge. Unfortunately, Internet Explorer is not supported due to the fact that it is not possible to insert HTML into a foreignObject.
+The frontend has been successfully tested with Chrome, Firefox, Safari and Microsoft Edge. Unfortunately, Internet Explorer is not supported due to the fact that it is not possible to insert HTML into a foreignObject.
 
 ## Background
 
 More information can be found in the following papers:
 
+Kraker, P., Beardmore, L., Hemila, M., Johann, D., Kaczmirek, L. & Schubert, C. (2024). [Partizipative Modelle im Zusammenspiel von Bibliotheken und KI-Systemen: Drei Fallstudien zur Integration der visuellen Recherche-Plattform Open Knowledge Maps](https://www.b-i-t-online.de/heft/2024-04-fachbeitrag-kraker.pdf). B.I.T. Online, 27(4), 327-335.
+
+Kraker, P., Goyal, G., Schramm, M., Akin, J., & Kittel, C. (2021). [CoVis: A curated, collaborative & visual knowledge base for COVID-19 research](https://doi.org/10.5281/zenodo.4586079). Zenodo. doi: 10.5281/zenodo.4586079
+
 Kraker, P., Schramm, M., Kittel, C., Chamberlain, S., & Arrow, T. (2018). [VIPER: The Visual Project Explorer](https://zenodo.org/record/1248119). Zenodo. doi:10.5281/zenodo.2587129
 
 Kraker, P., Kittel, C., & Enkhbayar, A. (2016). [Open Knowledge Maps: Creating a Visual Interface to the World’s Scientific Knowledge Based on Natural Language Processing](https://doi.org/10.12685/027.7-4-2-157). 027.7 Journal for Library Culture, 4(2), 98–103. doi:10.12685/027.7-4-2-157
 
 Kraker, P., Schlögl, C. , Jack, K. & Lindstaedt, S. (2015). [Visualization of Co-Readership Patterns from an Online Reference Management System](http://arxiv.org/abs/1409.0348). Journal of Informetrics, 9(1), 169–182. doi:10.1016/j.joi.2014.12.003
 
-Kraker, P., Weißensteiner, P., & Brusilovsky, P. (2014). [Altmetrics-based Visualizations Depicting the Evolution of a Knowledge Domain](http://know-center.tugraz.at/download_extern/papers/sti_visualization_evolution_kraker_etal.pdf). In 19th International Conference on Science and Technology Indicators (pp. 330–333).
-
 Kraker, P., Körner, C., Jack, K., & Granitzer, M. (2012). [Harnessing User Library Statistics for Research Evaluation and Knowledge Domain Visualization](http://know-center.tugraz.at/download_extern/papers/user_library_statistics.pdf). Proceedings of the 21st International Conference Companion on World Wide Web (pp. 1017–1024). Lyon: ACM. doi:10.1145/2187980.2188236
 
 ## License
 
 Head Start is licensed under [MIT](LICENSE).
 
+## Funding
+
 
-## Citation
 
-If you use Head Start in your research, please cite it as follows:
-
-Peter Kraker, Christopher Kittel, Maxi Schramm, Jan Konstant, Rainer Bachleitner, Thomas Arrow, Scott Chamberlain, Asura Enkhbayar, Yael Stein, Philipp Weissensteiner, Mike Skaug, Katrin Leinweber & Open Knowledge Maps team and contributors. (2019, March 7). Headstart 5 (Version v5). Zenodo. http://doi.org/10.5281/zenodo.2587129
+This project has received funding from the European Union's Horizon 2020 and Horizon Europe research and innovation programmes, under grant agreement nos. 831644, 863420, and 101129751.
diff --git a/docker-compose.yml b/docker-compose.yml
index 64cae54e6..16336b40d 100644
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -1,13 +1,13 @@
 services:
-
   db:
-    image: 'postgres:12.2-alpine'
+    image: "postgres:12.2-alpine"
     hostname: "${POSTGRES_HOSTNAME}"
     restart: unless-stopped
     environment:
       POSTGRES_USER: "${POSTGRES_USER}"
       POSTGRES_PASSWORD: "${POSTGRES_PASSWORD}"
-    command: postgres -c config_file=/etc/postgresql.conf -c hba_file=/etc/pg_hba.conf
+    command:
+      postgres -c config_file=/etc/postgresql.conf -c hba_file=/etc/pg_hba.conf
     volumes:
       - db_data:/var/lib/postgresql/data
       - ./local_dev/pg_hba.conf:/etc/pg_hba.conf
@@ -16,18 +16,26 @@ services:
       - headstart
 
   redis:
-    image: 'redis:6.0-alpine'
+    image: "redis:6.0-alpine"
     restart: unless-stopped
     hostname: "${REDIS_HOST}"
     environment:
       REDIS_HOST: "${REDIS_HOST}"
       REDIS_PORT: "${REDIS_PORT}"
-    command: ["redis-server", "/etc/redis/redis.conf", "--bind", "${REDIS_HOST}", "--port", "${REDIS_PORT}"]
+    command:
+      [
+        "redis-server",
+        "/etc/redis/redis.conf",
+        "--bind",
+        "${REDIS_HOST}",
+        "--port",
+        "${REDIS_PORT}",
+      ]
     volumes:
-        - 'redis:/var/lib/redis/data'
-        - ./local_dev/redis.conf:/etc/redis/redis.conf
+      - "redis:/var/lib/redis/data"
+      - ./local_dev/redis.conf:/etc/redis/redis.conf
     ports:
-        - "127.0.0.1:${REDIS_PORT}:${REDIS_PORT}"
+      - "127.0.0.1:${REDIS_PORT}:${REDIS_PORT}"
     networks:
       - headstart
 
@@ -80,6 +88,7 @@ services:
       - ./server/workers/persistence/src:/api
     depends_on:
       - redis
+      - db
     networks:
       - headstart
 
@@ -264,7 +273,6 @@ services:
     networks:
       - headstart
 
-
 volumes:
   redis:
   db_data:
diff --git a/headstart.png b/headstart.png
index a927d0d5c..085d33270 100644
Binary files a/headstart.png and b/headstart.png differ
diff --git a/server/workers/README.md b/server/workers/README.md
deleted file mode 100644
index c0c42fbca..000000000
--- a/server/workers/README.md
+++ /dev/null
@@ -1,164 +0,0 @@
-## This documentation is not up-to-date and following it will not result in a running server backend for Headstart. We apologize for any inconvenience this may cause and ask for patience until the public documentation has been updated.
-
-
-## Folder structure
-
-Following backend component containers are currently in `workers`:
-
-* dataprocessing: Executing the machine learning and natural language processing
-* services: a Flask-based API, providing endpoints for each integrated data source
-
-Each comes with a docker file (ending on `.docker`), which is used for creating a container, and a source code folder.
-
-## Setup
-
-### Install docker and docker-compose
-
-Please follow the install instructions for your OS:
-
-* Mac: https://docs.docker.com/docker-for-mac/install/
-* Ubuntu: https://docs.docker.com/docker-for-mac/install/ (also available for other Linux)
-
-Please follow the install instructions for docker-compose for your OS: https://docs.docker.com/compose/install/
-
-### Setting up the Apache2 reverse proxy
-
-Following Apache2 mods have to be installed and enabled:
-
-* ssl
-* proxy
-* proxy_balancer
-* proxy_http
-
-Possibly also following modules need to be installed and enabled:
-* mod_slotmem_shm
-
-The following lines have to be added to the appropriate sites-available config of Apache2 webserver:
-
-```
-
-    #
-    # other config
-
-    # Proxy server settings for Head Start API
-
-    Deny from all
-    Allow from 127.0.0.1
-    ProxyPass http://127.0.0.1:8080/
-    ProxyPassReverse http://127.0.0.1/api
-
-
-
-```
-
-After that, restart the Apache2 service.
-
-## Configuration
-
-Setting up configurations for each backend service:
-
-Dataprocessing:
-* In `server/workers/dataprocessing` copy `example_dataprocessing.env` to `dataprocessing.env` and set the desired loglevel.
-
-Services:
-* In `server/workers/services/src/config` copy `example_settings.py` to `settings.py` and change the values for `ENV` (`development` or `production`) and `DEBUG` (`TRUE` or `FALSE`).
-* In `settings.py` you can also configure databases.
-
-Secure Redis:
-* In `server/workers` copy `example_redis.conf` to `redis.conf` and replace "long_secure_password" with a long, secure password (Line 507 in redis.conf, parameter `requirepass`).
-
-Secure Postgres:
-* In `server/workers` duplicate `example_pg_hba.conf` to `pg_hba.conf` and review the settings. The default values should be ok for a default deployment (host connections are only allowed for user "headstart" with an md5-hashed password), but you may want to change access rights.
-
-
-Overall deployment environment variables:
-PostgreSQL service:
-* In `server/workers/flavorconfigs` folder create a new `flavorname.env` from the `example.env` and fill in the environment variables with the correct login data.
-  * This includes Postgresql and redis settings
-
-
-* Manual database creation for Postgres:
-
-Enter container: `docker exec -it VARYINGNAME_db_1 psql -U headstart`
-
-Execute command: `CREATE DATABASE databasename;`
-
-Exit the container and re-enter it as normal user: `docker exec -it VARYINGNAME_persistence_1 /bin/bash`
-
-Execute command: `python manage.py`
-
-* In `preprocessing/conf/config_local.ini` change "databasename" to the dev/production database name for the specific integration. This should be in line with the database names provided in `settings.py`
-
-
-* Running backup processes for postgres-volumes:
-
-https://hub.docker.com/p/loomchild/volume-backup
-
-### Adding a new versioned "flavor" of the backend
-
-
-1. Make changes to code in `server/workers` (any API /integration, …)
-1. Commit changes
-1. Checkout commit (make note of commit hash)
-1. Run `server/workers/build_docker_images.sh`
-1. Create new {flavor}.env in `server/workers/flavorconfigs/` using `example.env` as template. Set the “COMPOSE_PROJECT_NAME={flavor}” and the SERVICE_VERSION={commit hash} to the values from step 3.
-1. Run `docker-compose up --env-file server/workers/flavorconfigs/flavor.env -d` to start the services
-1. Add new entry to `server/workers/proxy/templates/default.conf.templates`
-1. Add flavored networks to `server/workers/proxy/docker-compose.yml` so that the Nginx-proxy knows where to find the specific versioned services
-1. Down and up the proxy service from `server/workers/proxy` working directory
-1. Test by e.g. `curl -vvvv localhost/api/{flavor}/base/service_version`
-
-
-### Starting a specific versioned "flavor" of the backend services with docker-compose
-
-Following commands have to be executed from the root folder of the repository, where `docker-compose.yml` is located.
-
-**Start services and send them to the docker daemon**
-
-```
-docker-compose --env-file server/workers/flavorconfigs/flavor.env up -d
-```
-
-
-**Shutting service down**
-
-```
-docker-compose --env-file server/workers/flavorconfigs/flavor.env down
-```
-
-
-### Adding a new service to the backend
-
-1. Add service configuration in docker-compose.yml
-    1. Add required environment variables that need to be passed from .env to container in docker-compose.yml
-1. Add service related changes in build-docker-images.sh
-    1. Add service to build list
-1. Add service source code and Dockerfile in a new folder in `server/workers`
-1. Add new env variables to .env files
-
-
-### Integrating with clients
-
-In `server/preprocessing/conf/config_local.ini` change the following configs:
-```
-# URL to OKMaps API
-api_url = "http://127.0.0.1/api/"
-# flavor of API, default: "stable"
-api_flavor = "stable"
-# The persistence backend to use - either api or legacy
-persistence_backend = "api"
-# The processing backend to use - either api or legacy
-processing_backend = "api"
-```
-
-
-## Updating R dependencies
-
-1. start rstudio
-2. navigate to folder of worker file, e.g. /workers/base: setwd("~/projects/OpenKnowledgeMaps/Headstart/server/workers/base")
-3. initiate renv with renv::activate()
-4. check if dependencies.R is up to date
-5. make any updates to packages as required, e.g. installing remotes::install_github('OpenKnowledgeMaps/rbace', force=TRUE)
-6. update renv.lock file with renv::snapshot()
-7. review lock file
-8. if OK, commit lockfile