BIGSdb consists of two main Perl scripts, bigsdb.pl and bigscurate.pl, that run the query and curator's interfaces respectively. These need to be located somewhere within the web cgi-bin directories. In addition, there are a large number of library files, used by both these scripts, that are installed by default in /usr/local/lib/BIGSdb. Plugin scripts are stored within a 'Plugins' sub-directory of this library directory.
All databases on a system can use the same instance of the scripts, or alternatively any database can specify a particular path for each script, enabling these script directories to be protected by apache htaccess directives.
- :doc:`Software requirements <dependencies>`
- Download from SourceForge.net or GitHub.
Unpack the distribution package in a temporary directory:
gunzip bigsdb_1.x.x.tar.gz tar xvf bigsdb_1.x.x.tar
Copy the bigsdb.pl and bigscurate.pl scripts to a subdirectory of your web server's cgi-bin directory. Make sure these are readable and executable by the web server daemon.
Copy the contents of the lib directory to /usr/local/lib/BIGSdb/. Make sure you include the Plugins and Offline directories which are subdirectories of the main lib directory.
Copy the contents of the javascript directory to a javascript directory within the web root tree, i.e. accessible from http://your_website/javascript/.
Copy the bigsdb.css and jquery-ui.css stylesheets to the root directory of your website, i.e. accessible from http://your_website/bigsdb.css.
Copy the images directory to the root directory of your website, i.e. accessible from http://your_website/images.
Copy the contents of the conf directory to /etc/bigsdb/. Check the paths of helper applications and database names in the bigsdb.conf file and modify for your system.
Create a PostgreSQL database user called apache - this should not have any special priveleges. First you will need to log in as the postgres user:
sudo su postgres
Then use the createuser command to do this, e.g.
createuser apache
From the psql command line, set the apache user password:
psql ALTER ROLE apache WITH PASSWORD 'remote';
Create PostgreSQL databases called bigsdb_auth, bigsdb_prefs and bigsdb_refs using the scripts in the sql directory. Create the database using the createdb command and set up the tables using the psql command.
createdb bigsdb_auth psql -f auth.sql bigsdb_auth createdb bigsdb_prefs psql -f prefs.sql bigsdb_prefs createdb bigsdb_refs psql -f refs.sql bigsdb_refs
Create a writable temporary directory in the root of the web site called tmp, i.e. accessible from http://your_website/tmp.
Create a log file, bigsdb.log, in /var/log owned by the web server daemon, e.g.
touch /var/log/bigsdb.log chown www-data /var/log/bigsdb.log
(substitute www-data for the web daemon user).
PostgreSQL can be configured in many ways and how you do this will depend on your site requirements.
The following security settings will allow the appropriate users 'apache' and 'bigsdb' to access databases without allowing all logged in users full access. Only the UNIX users 'postgres' and 'webmaster' can log in to the databases as the Postgres user 'postgres'.
You will need to edit the pg_hba.conf and pg_ident.conf files. These are found somewhere like /etc/postgresql/9.1/main/
pg_hba.conf
# Database administrative login by UNIX sockets local all postgres ident map=mymap # TYPE DATABASE USER CIDR-ADDRESS METHOD # "local" is for Unix domain socket connections only local all all ident map=mymap # IPv4 local connections: host all all 127.0.0.1/32 md5 # IPv6 local connections: host all all ::1/128 md5
pg_ident.conf
# MAPNAME SYSTEM-USERNAME PG-USERNAME mymap postgres postgres mymap webmaster postgres mymap www-data apache mymap bigsdb bigsdb mymap bigsdb apache
You may also need to change some settings in the postgresql.conf file. As an example, a configuration for a machine with 16GB RAM, allowing connections from a separate web server may have the following configuration changes made:
listen_addresses = '*' max_connections = 200 shared_buffers = 1024Mb work_mem = 8Mb effective_cache_size = 8192Mb stats_temp_directory = '/dev/shm'
Setting stats_temp_directory to /dev/shm makes use of a ramdisk usually available on Debian or Ubuntu systems for frequently updated working files. This reduces a lot of unneccessary disk access.
See Tuning Your PostgreSQL Server for more details.
Restart PostgreSQL after any changes, e.g.
/etc/init.d/postgresql restart
Site-specific configuration files are located in /etc/bigsdb by default.
- :download:`bigsdb.conf <conf/bigsdb.conf>` - main configuration file
- :download:`logging.conf <conf/logging.conf>` - error logging settings. See log4perl project website for advanced configuration details.
To run plugins that require a long time to complete their analyses, an offline job manager has been developed. The plugin will save the parameters of a job to a job database and then provide a link to the job status page. An offline script, run frequently from CRON, will then process the job queue and update status and outputs via the job status page.
Create a 'bigsdb' UNIX user, e.g.:
sudo useradd -s /bin/sh bigsdb
As the postgres user, create a 'bigsdb' user and create a bigsdb_jobs database using the jobs.sql SQL file, e.g.:
createuser bigsdb [no need for special priveleges] createdb bigsdb_jobs psql -f jobs.sql bigsdb_jobs
From the psql command line, set the bigsdb user password::
psql ALTER ROLE bigsdb WITH PASSWORD 'bigsdb';
Set up the jobs parameters in the /etc/bigsdb/bigsdb.conf file, e.g.:
jobs_db=bigsdb_jobs max_load=8
The jobs script will not process a job if the server's load average (over the last minute) is higher than the max_load parameter. This should be set higher than the number of processor cores or you may find that jobs never run on a busy server. Setting it to double the number of cores is probably a good starting point.
Copy the job_logging.conf file to the /etc/bigsdb directory.
Set the script to run frequently (preferably every minute) from CRON. Note that CRON does not like '.' in executable filenames, so either rename the script to 'bigsjobs' or create a symlink and call that from CRON, e.g.:
copy bigsjobs.pl to /usr/local/bin sudo ln -s /usr/local/bin/bigsjobs.pl /usr/local/bin/bigsjobs
You should install xvfb, which is a virtual X server that may be required for third party applications called from plugins. This is required, for example, for calling splitstree4 from the Genome Comparator plugin.
Add the following to /etc/crontab::
* * * * * bigsdb xvfb-run -a /usr/local/bin/bigsjobs
(set to run every minute from the 'bigsdb' user account).
If you'd like to run this more frequently, e.g. every 30 seconds, multiple entries can be added to CRON with an appropriate sleep prior to running, e.g.:
* * * * * bigsdb xvfb-run -a /usr/local/bin/bigsjobs * * * * * bigsdb sleep 30;xvfb-run -a /usr/local/bin/bigsjobs
Create a log file, bigsdb_jobs.log, in /var/log owned by 'bigsdb', e.g.:
sudo touch /var/log/bigsdb_jobs.log sudo chown bigsdb /var/log/bigsdb_jobs.log
There are two temporary directories (one public, one private) which may accumulate temporary files over time. Some of these are deleted automatically when no longer required but some cannot be cleaned automatically since they are used to display results after clicking a link or to pass the database query between pages of results.
The easiest way to clean the temp directories is to run a cleaning script periodically, e.g. create a root-executable script in /etc/cron.hourly containing the following::
#!/bin/sh #Remove temp BIGSdb files from secure tmp folder older than 1 week. find /var/tmp/ -name '*BIGSdb_*' -type f -mmin +10080 -exec rm -f {} \; 2>/dev/null #Remove .jnlp files from web tree older than 1 day find /var/www/tmp/ -name '*.jnlp' -type f -mmin +1440 -exec rm -f {} \; 2>/dev/null #Remove other tmp files from web tree older than 1 week find /var/www/tmp/ -type f -mmin +10080 -exec rm -f {} \; 2>/dev/null
The preferences database stores user preferences for BIGSdb databases running on the site. Every user will have a globally unique identifier (guid) stored in this database along with a datestamp indicating the last access time. On public databases that do not require logging in, this guid is stored as a cookie on the user's computer. Databases that require logging in use a combination of database and username as the identifier. Over time, the preferences database can get quite large since every unique user will result in an entry in the database. Since many of these entries represent casual users, or even web indexing bots, they can be periodically cleaned out based on their last access time. A weekly CRON job can be set up to remove any entries older than a defined period. For example, the following line entered in /etc/crontab will remove the preferences for any user that has not accessed any database in the past 6 months (the script will run at 6pm every Sunday).
#Prevent prefs database getting too large 00 18 * * 0 postgres psql -c "DELETE FROM guid WHERE last_accessed < NOW() - INTERVAL '6 months'" bigsdb_prefs
Set the log file to auto rotate by adding a file called 'bigsdb' with the following contents to /etc/logrotate.d:
/var/log/bigsdb.log { weekly rotate 4 compress copytruncate missingok notifempty create 640 root adm } /var/log/bigsdb_jobs.log { weekly rotate 4 compress copytruncate missingok notifempty create 640 root adm }
Major version changes, e.g. 1.7 -> 1.8, indicate that there has been a change to the underlying database structure for one or more of the database types. Scripts to upgrade the database are provided in sql/upgrade and are named by the database type and version number. For example, to upgrade an isolate database (bigsdb_isolates) from version 1.7 to 1.8, log in as the postgres user and type:
psql -f isolatedb_v1.8.sql bigsdb_isolates
Upgrades are sequential, so to upgrade from a version earlier than the last major version you would need to upgrade to the intermediate version first, e.g. to go from 1.6 -> 1.8, requires upgrading to 1.7 first.
Minor version changes, e.g. 1.8.0 -> 1.8.1, have no modifications to the database structures. There will be changes to the Perl library modules and possibly to the contents of the Javascript directory, images directory and CSS files. The version number is stored with the bigsdb.pl script, so this should also be updated so that BIGSdb correctly reports its version.