diff --git a/README.md b/README.md index 9cc7b315b88..ce4d92e673a 100644 --- a/README.md +++ b/README.md @@ -1,7 +1,7 @@ Dataverse ========== -Dataverse is an open source web application for sharing, citing, analyzing, and preserving research data (developed by the [Data Science team] (http://datascience.iq.harvard.edu/about-dataverse) at the [Institute for Quantitative Social Science] (http://iq.harvard.edu/)). +Dataverse is an open source web application for sharing, citing, analyzing, and preserving research data (developed by the [Data Science and Products team](http://www.iq.harvard.edu/people/people/data-science-products) at the [Institute for Quantitative Social Science](http://iq.harvard.edu/)). Institutions and organizations can choose to install the Dataverse software for their own use. In this case, the institution will be responsible for maintaining the application; installing upgrades, diff --git a/conf/httpd/conf.d/dataverse.conf b/conf/httpd/conf.d/dataverse.conf index 8a6c119aeb5..4d6ddc4639a 100644 --- a/conf/httpd/conf.d/dataverse.conf +++ b/conf/httpd/conf.d/dataverse.conf @@ -9,11 +9,11 @@ ProxyPassMatch ^/error-documents !
# pass everything else to Glassfish ProxyPass / ajp://localhost:8009/ - - AuthType shibboleth - ShibRequestSetting requireSession 1 - require valid-user - +# +# AuthType shibboleth +# ShibRequestSetting requireSession 1 +# require valid-user +# ErrorDocument 503 /error-documents/503.html Alias /error-documents /var/www/dataverse/error-documents diff --git a/conf/solr/4.6.0/schema.xml b/conf/solr/4.6.0/schema.xml index 10f9b07be5c..323429b62da 100644 --- a/conf/solr/4.6.0/schema.xml +++ b/conf/solr/4.6.0/schema.xml @@ -298,6 +298,8 @@ + + diff --git a/doc/sphinx-guides/source/_static/api/dataverse-complete.json b/doc/sphinx-guides/source/_static/api/dataverse-complete.json new file mode 100644 index 00000000000..d5e92d1f1fc --- /dev/null +++ b/doc/sphinx-guides/source/_static/api/dataverse-complete.json @@ -0,0 +1,15 @@ +{ + "name": "Scientific Research", + "alias": "science", + "dataverseContacts": [ + { + "contactEmail": "pi@example.edu" + }, + { + "contactEmail": "student@example.edu" + } + ], + "affiliation": "Scientific Research University", + "description": "We do all the science.", + "dataverseType": "LABORATORY" +} diff --git a/doc/sphinx-guides/source/_static/api/dataverse-minimal.json b/doc/sphinx-guides/source/_static/api/dataverse-minimal.json new file mode 100644 index 00000000000..10086749825 --- /dev/null +++ b/doc/sphinx-guides/source/_static/api/dataverse-minimal.json @@ -0,0 +1,9 @@ +{ + "name": "Scientific Research", + "alias": "science", + "dataverseContacts": [ + { + "contactEmail": "pi@example.edu" + } + ] +} diff --git a/doc/sphinx-guides/source/_static/installation/files/etc/maintenance/HarvardShield_RGB.png b/doc/sphinx-guides/source/_static/installation/files/etc/maintenance/HarvardShield_RGB.png new file mode 100644 index 00000000000..f8fd03c2aa8 Binary files /dev/null and b/doc/sphinx-guides/source/_static/installation/files/etc/maintenance/HarvardShield_RGB.png differ diff --git 
a/doc/sphinx-guides/source/_static/installation/files/etc/maintenance/maintenance.xhtml b/doc/sphinx-guides/source/_static/installation/files/etc/maintenance/maintenance.xhtml new file mode 100644 index 00000000000..78b6100c9eb --- /dev/null +++ b/doc/sphinx-guides/source/_static/installation/files/etc/maintenance/maintenance.xhtml @@ -0,0 +1,118 @@ + Harvard Dataverse
+ Harvard Dataverse
+ A collaboration with Harvard Library, Harvard University IT, and IQSS
+ We apologize for the service interruption.
+ The Harvard Dataverse is currently undergoing maintenance. At this time, neither the application nor the APIs are available. However, the datasets stored in the Harvard Dataverse are safe and are not affected by this maintenance.
+ If you have any comments, questions, or concerns, please reach out to support@dataverse.org.
+ + + diff --git a/doc/sphinx-guides/source/_static/installation/files/etc/selinux/targeted/src/policy/domains/misc/shibboleth.te b/doc/sphinx-guides/source/_static/installation/files/etc/selinux/targeted/src/policy/domains/misc/shibboleth.te new file mode 100644 index 00000000000..ae78e8aa54f --- /dev/null +++ b/doc/sphinx-guides/source/_static/installation/files/etc/selinux/targeted/src/policy/domains/misc/shibboleth.te @@ -0,0 +1,15 @@ +module shibboleth 1.0; + +require { + class file {open read}; + class sock_file write; + class unix_stream_socket connectto; + type httpd_t; + type initrc_t; + type var_run_t; + type var_t; +} + +allow httpd_t initrc_t:unix_stream_socket connectto; +allow httpd_t var_run_t:sock_file write; +allow httpd_t var_t:file {open read}; diff --git a/doc/sphinx-guides/source/api/client-libraries.rst b/doc/sphinx-guides/source/api/client-libraries.rst index f6fe1f980c7..17caee5d685 100755 --- a/doc/sphinx-guides/source/api/client-libraries.rst +++ b/doc/sphinx-guides/source/api/client-libraries.rst @@ -1,7 +1,7 @@ Client Libraries ================ -Currently there are client libraries for Python and R that can be used to develop against Dataverse APIs. +Currently there are client libraries for Python, R, and Java that can be used to develop against Dataverse APIs. We use the term "client library" on this page but "Dataverse SDK" (software development kit) is another way of describing these resources. They are designed to help developers express Dataverse concepts more easily in the languages listed below. For support on any of these client libraries, please consult each project's README. Because Dataverse is a SWORD server, additional client libraries exist for Java, Ruby, and PHP per the :doc:`/api/sword` page. @@ -10,7 +10,7 @@ Python https://github.com/IQSS/dataverse-client-python is the offical Python package for Dataverse APIs. 
-`Robert Liebowitz `_ from the `Center for Open Science `_ heads its development and the library is used to integrate the `Open Science Framework (OSF) `_ with Dataverse via an add-on which itself is open source and listed on the :doc:`/api/apps` page. +`Robert Liebowitz `_ created this library while at the `Center for Open Science (COS) `_ and the COS uses it to integrate the `Open Science Framework (OSF) `_ with Dataverse via an add-on which itself is open source and listed on the :doc:`/api/apps` page. R - @@ -18,3 +18,10 @@ R https://github.com/IQSS/dataverse-client-r is the official R package for Dataverse APIs. It was created by `Thomas Leeper `_ whose dataverse can be found at https://dataverse.harvard.edu/dataverse/leeper + +Java +---- + +https://github.com/IQSS/dataverse-client-java is the official Java library for Dataverse APIs. + +`Richard Adams `_ from `ResearchSpace `_ created and maintains this library. diff --git a/doc/sphinx-guides/source/api/native-api.rst b/doc/sphinx-guides/source/api/native-api.rst index 8b686df66cd..c7f15c43e2b 100644 --- a/doc/sphinx-guides/source/api/native-api.rst +++ b/doc/sphinx-guides/source/api/native-api.rst @@ -12,11 +12,24 @@ Endpoints Dataverses ~~~~~~~~~~~ -Generates a new dataverse under ``$id``. Expects a json content describing the dataverse. +Generates a new dataverse under ``$id``. Expects JSON content describing the dataverse, as in the example below. If ``$id`` is omitted, a root dataverse is created. ``$id`` can either be a dataverse id (long) or a dataverse alias (more robust). :: POST http://$SERVER/api/dataverses/$id?key=$apiKey +The following JSON example can be `downloaded <../_static/api/dataverse-complete.json>`_ and modified to create dataverses to suit your needs. The fields ``name``, ``alias``, and ``dataverseContacts`` are required. 
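Since ``name``, ``alias``, and ``dataverseContacts`` are the required fields, a client can assemble and sanity-check the request body before POSTing it. A minimal Python sketch (the ``build_dataverse`` helper and its validation are illustrative, not part of Dataverse; only the field names come from the example payloads):

```python
import json

# Sketch: assemble the minimal JSON body for POST /api/dataverses/$id.
# Only the field names (name, alias, dataverseContacts) come from the
# documented payloads; this helper itself is hypothetical, not Dataverse code.
REQUIRED_FIELDS = ("name", "alias", "dataverseContacts")

def build_dataverse(name, alias, contact_emails):
    """Build and validate the request body, returning it as a JSON string."""
    payload = {
        "name": name,
        "alias": alias,
        "dataverseContacts": [{"contactEmail": e} for e in contact_emails],
    }
    missing = [f for f in REQUIRED_FIELDS if not payload.get(f)]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    return json.dumps(payload)
```

The returned string can then be sent as the body of the ``POST`` request shown above.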
The controlled vocabulary for ``dataverseType`` is: + +- ``JOURNALS`` +- ``LABORATORY`` +- ``ORGANIZATIONS_INSTITUTIONS`` +- ``RESEARCHERS`` +- ``RESEARCH_GROUP`` +- ``RESEARCH_PROJECTS`` +- ``TEACHING_COURSES`` +- ``UNCATEGORIZED`` + +.. literalinclude:: ../_static/api/dataverse-complete.json + View data about the dataverse identified by ``$id``. ``$id`` can be the id number of the dataverse, its alias, or the special value ``:root``. :: GET http://$SERVER/api/dataverses/$id @@ -70,12 +83,15 @@ Sets the metadata blocks of the dataverse. Makes the dataverse a metadatablock r Get whether the dataverse is a metadata block root, or uses its parent blocks:: - GET http://$SERVER/api/dataverses/$id/metadatablocks/:isRoot?key=$apiKey + GET http://$SERVER/api/dataverses/$id/metadatablocks/isRoot?key=$apiKey Set whether the dataverse is a metadata block root, or whether it uses its parent blocks. Possible values are ``true`` and ``false`` (both are valid JSON expressions). :: - POST http://$SERVER/api/dataverses/$id/metadatablocks/:isRoot?key=$apiKey + PUT http://$SERVER/api/dataverses/$id/metadatablocks/isRoot?key=$apiKey + +.. note:: Previous endpoints ``GET http://$SERVER/api/dataverses/$id/metadatablocks/:isRoot?key=$apiKey`` and ``POST http://$SERVER/api/dataverses/$id/metadatablocks/:isRoot?key=$apiKey`` are deprecated, but supported. + Create a new dataset in dataverse ``id``. The post data is a JSON object, containing the dataset fields and an initial dataset version, under the field ``"datasetVersion"``. The initial version's version number will be set to ``1.0``, and its state will be set to ``DRAFT`` regardless of the content of the JSON object. Example JSON can be found at ``data/dataset-create-new.json``. 
:: @@ -131,8 +147,7 @@ Export the metadata of the current published version of a dataset in various for GET http://$SERVER/api/datasets/export?exporter=ddi&persistentId=$persistentId - Note: Supported exporters (export formats) are ddi, oai_ddi, dcterms, oai_dc, and dataverse_json. - +.. note:: Supported exporters (export formats) are ``ddi``, ``oai_ddi``, ``dcterms``, ``oai_dc``, and ``dataverse_json``. Lists all the file metadata, for the given dataset and version:: @@ -268,6 +283,14 @@ Management of Shibboleth groups via API is documented in the :doc:`/installation Info ~~~~ +Get the Dataverse version. The response contains the version and build numbers:: + + GET http://$SERVER/api/info/version + +Get the server name. This is useful when a Dataverse system is composed of multiple Java EE servers behind a load balancer:: + + GET http://$SERVER/api/info/server + For now, only the value for the ``:DatasetPublishPopupCustomText`` setting from the :doc:`/installation/config` section of the Installation Guide is exposed:: GET http://$SERVER/api/info/settings/:DatasetPublishPopupCustomText @@ -286,7 +309,7 @@ Return data about the block whose ``identifier`` is passed. ``identifier`` can e Admin ~~~~~~~~~~~~~~~~ -This is the administrative part of the API. It is probably a good idea to block it before allowing public access to a Dataverse installation. Blocking can be done using settings. See the ``post-install-api-block.sh`` script in the ``scripts/api`` folder for details. +This is the administrative part of the API. For security reasons, it is absolutely essential that you block it before allowing public access to a Dataverse installation. Blocking can be done using settings. See the ``post-install-api-block.sh`` script in the ``scripts/api`` folder for details. See also "Blocking API Endpoints" under "Securing Your Installation" in the :doc:`/installation/config` section of the Installation Guide. 
List all settings:: @@ -374,6 +397,18 @@ List all role assignments of a role assignee (i.e. a user or a group):: Note that ``identifier`` can contain slashes (e.g. ``&ip/localhost-users``). +List permissions a user (based on API Token used) has on a dataverse or dataset:: + + GET http://$SERVER/api/admin/permissions/$identifier + +The ``$identifier`` can be a dataverse alias or database id or a dataset persistent ID or database id. + +List a role assignee (i.e. a user or a group):: + + GET http://$SERVER/api/admin/assignee/$identifier + +The ``$identifier`` should start with an ``@`` if it's a user. Groups start with ``&``. "Built in" users and groups start with ``:``. Private URL users start with ``#``. + IpGroups ^^^^^^^^ diff --git a/doc/sphinx-guides/source/conf.py b/doc/sphinx-guides/source/conf.py index ca7b40d4dc6..4153e93a64f 100755 --- a/doc/sphinx-guides/source/conf.py +++ b/doc/sphinx-guides/source/conf.py @@ -64,9 +64,9 @@ # built documents. # # The short X.Y version. -version = '4.5.1' +version = '4.6' # The full version, including alpha/beta/rc tags. -release = '4.5.1' +release = '4.6' # The language for content autogenerated by Sphinx. Refer to documentation # for a list of supported languages. diff --git a/doc/sphinx-guides/source/developers/index.rst b/doc/sphinx-guides/source/developers/index.rst index a9daa12b16c..f16953a3265 100755 --- a/doc/sphinx-guides/source/developers/index.rst +++ b/doc/sphinx-guides/source/developers/index.rst @@ -20,3 +20,4 @@ Contents: making-releases tools unf/index + selinux diff --git a/doc/sphinx-guides/source/developers/selinux.rst b/doc/sphinx-guides/source/developers/selinux.rst new file mode 100644 index 00000000000..7062a7d1f01 --- /dev/null +++ b/doc/sphinx-guides/source/developers/selinux.rst @@ -0,0 +1,110 @@ +======= +SELinux +======= + +.. 
contents:: :local: +Introduction +------------ + +The ``shibboleth.te`` file below, which is mentioned in the :doc:`/installation/shibboleth` section of the Installation Guide, was created on CentOS 6 as part of https://github.com/IQSS/dataverse/issues/3406 but may need to be revised for future versions of RHEL/CentOS. The file is versioned with the docs and can be found in the following location: + +``doc/sphinx-guides/source/_static/installation/files/etc/selinux/targeted/src/policy/domains/misc/shibboleth.te`` + +.. literalinclude:: ../_static/installation/files/etc/selinux/targeted/src/policy/domains/misc/shibboleth.te + :language: text + +This document is something of a survival guide for anyone who is tasked with updating this file. + +Development Environment +----------------------- + +In order to work on the ``shibboleth.te`` file you need to ``ssh`` into a RHEL or CentOS box running Shibboleth (instructions are in the :doc:`/installation/shibboleth` section of the Installation Guide), such as https://beta.dataverse.org or https://demo.dataverse.org, that has all the commands below installed. As of this writing, the ``policycoreutils-python`` RPM was required. + +Recreating the shibboleth.te File +--------------------------------- + +If you're reading this page because someone has reported that Shibboleth doesn't work with SELinux anymore (due to an operating system upgrade, perhaps), you *could* start with the existing ``shibboleth.te`` file, but it is recommended that you create a new one instead to ensure that extra lines aren't included that are no longer necessary. 
+ +The file you're recreating is called a Type Enforcement (TE) file, and you can read more about it at https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Security-Enhanced_Linux/chap-Security-Enhanced_Linux-SELinux_Contexts.html + +The following doc may or may not be helpful to orient you: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Security-Enhanced_Linux/sect-Security-Enhanced_Linux-Fixing_Problems-Allowing_Access_audit2allow.html + +Ensure that SELinux is Enforcing +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +If ``getenforce`` returns anything other than ``Enforcing``, run ``setenforce Enforcing`` or otherwise configure SELinux by editing ``/etc/selinux/config`` and rebooting until SELinux is enforcing. + +Removing the Existing shibboleth.te Rules +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Use ``semodule -l | grep shibboleth`` to see if the ``shibboleth.te`` rules are already installed. Run ``semodule -r shibboleth`` to remove the module, if necessary. Now we're at square one (no custom rules) and ready to generate a new ``shibboleth.te`` file. + +Exercising SELinux denials +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +As of this writing, there are two optional components of Dataverse that are known not to work out of the box with SELinux: Shibboleth and rApache. + +We will be exercising SELinux denials with Shibboleth, and the following SELinux-related issues are expected out of the box: + +- Problems with the dropdown of institutions being created on the Login Page ("Internal Error - Failed to download metadata from /Shibboleth.sso/DiscoFeed."). +- Problems with the return trip after you've logged into HarvardKey or whatever ("shibsp::ListenerException" and "Cannot connect to shibd process, a site adminstrator should be notified."). + +In short, all you need to do is try to log in with Shibboleth and you'll see problems associated with SELinux being enabled. 
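The AVC denial messages you will be collecting have a regular shape, so they can also be dissected programmatically. A Python sketch (the regex and helper names are mine, not part of ``audit2allow`` or Dataverse; the sample line is the denial quoted in this guide):

```python
import re

# Sketch: pull the fields audit2allow needs out of a raw AVC denial line
# from /var/log/audit/audit.log. The regex and helper names are illustrative.
AVC_RE = re.compile(
    r"avc:\s+denied\s+\{ (?P<perm>[^}]+) \}"   # denied permission(s)
    r".*scontext=\w+:\w+:(?P<stype>\w+):"      # source (process) type
    r".*tcontext=\w+:\w+:(?P<ttype>\w+):"      # target (object) type
    r".*tclass=(?P<tclass>\w+)"                # object class
)

def parse_avc(line):
    """Return the denial's fields as a dict, or None for non-AVC lines."""
    match = AVC_RE.search(line)
    return match.groupdict() if match else None

def to_allow(fields):
    """Render a denial as the corresponding allow rule, audit2allow-style."""
    return "allow {stype} {ttype}:{tclass} {perm};".format(**fields)

# The denial captured while exercising Shibboleth (quoted in this guide):
SAMPLE = (
    'type=AVC msg=audit(1476728970.378:271405): avc: denied { write } '
    'for pid=28548 comm="httpd" name="shibd.sock" dev=dm-2 ino=393300 '
    'scontext=unconfined_u:system_r:httpd_t:s0 '
    'tcontext=unconfined_u:object_r:var_run_t:s0 tclass=sock_file'
)
```

For the sample line, ``to_allow(parse_avc(SAMPLE))`` yields ``allow httpd_t var_run_t:sock_file write;``, the same rule ``audit2allow -r`` produces for that denial.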
+ +Stub out the new shibboleth.te file +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Iterate on the new ``shibboleth.te`` file wherever you like, such as the root user's home directory in the example below. Start by adding a ``module`` line like this: + +``echo 'module shibboleth 1.0;' > /root/shibboleth.te`` + +Note that a version is required and perhaps it should be changed, but we'll stick with ``1.0`` for now. The point is that the ``shibboleth.te`` file must begin with that "module" line or else the ``checkmodule`` command you'll need to run later will fail. Your file should look like this: + +.. code-block:: text + + module shibboleth 1.0; + # require lines go here + # allow lines go here + +Iteratively Use audit2allow to Add Rules and Test Your Change +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Now that ``shibboleth.te`` has been stubbed out, we will iteratively add lines to it from the output of piping SELinux Access Vector Cache (AVC) denial messages to ``audit2allow -r``. These errors are found in ``/var/log/audit/audit.log`` so tail the file as you attempt to log in to Shibboleth. + +``# tail -f /var/log/audit/audit.log | fgrep type=AVC`` + +You should see messages that look something like this: + +``type=AVC msg=audit(1476728970.378:271405): avc: denied { write } for pid=28548 comm="httpd" name="shibd.sock" dev=dm-2 ino=393300 scontext=unconfined_u:system_r:httpd_t:s0 tcontext=unconfined_u:object_r:var_run_t:s0 tclass=sock_file`` + +Next, pipe these messages to ``audit2allow -r`` like this: + +``echo 'type=AVC msg=audit(1476728970.378:271405): avc: denied { write } for pid=28548 comm="httpd" name="shibd.sock" dev=dm-2 ino=393300 scontext=unconfined_u:system_r:httpd_t:s0 tcontext=unconfined_u:object_r:var_run_t:s0 tclass=sock_file' | audit2allow -r`` + +This will produce output like this: + +.. 
code-block:: text + + require { + type var_run_t; + type httpd_t; + class sock_file write; + } + + #============= httpd_t ============== + allow httpd_t var_run_t:sock_file write; + +Copy and paste this output into the ``shibboleth.te`` file you stubbed out above. Then, use the same ``checkmodule``, ``semodule_package``, and ``semodule`` commands documented in the :doc:`/installation/shibboleth` section of the Installation Guide on your file to activate the SELinux rules you're constructing. + +Once your updated SELinux rules are in place, try logging in with Shibboleth again. You should see a different AVC error. Pipe that error into ``audit2allow -r`` as well and put the resulting content into the ``shibboleth.te`` file you're constructing. As you do this, manually reformat the file using the following rules: + +- Put the ``require`` block at the top. +- Within the require block, sort the lines. +- Put the ``allow`` lines at the bottom and sort them. +- Where possible, avoid duplicate lines by combining operations such as ``open`` and ``read`` into ``{open read}``. +- Remove all comment lines. + +Keep iterating until it works and then create a pull request based on your updated file. Good luck! + +Many thanks to Bill Horka from IQSS for his assistance in explaining how to construct a SELinux Type Enforcement (TE) file! diff --git a/doc/sphinx-guides/source/developers/tools.rst b/doc/sphinx-guides/source/developers/tools.rst index 76d75edf4e4..cb9adc4ab5b 100755 --- a/doc/sphinx-guides/source/developers/tools.rst +++ b/doc/sphinx-guides/source/developers/tools.rst @@ -50,25 +50,16 @@ According to https://pagekite.net/support/free-for-foss/ PageKite (very generous Vagrant +++++++ -Vagrant allows you to spin up a virtual machine running Dataverse on -your development workstation. +Vagrant allows you to spin up a virtual machine running Dataverse on your development workstation. 
You'll need to install Vagrant from https://www.vagrantup.com and VirtualBox from https://www.virtualbox.org. + +We assume you have already cloned the repo from https://github.com/IQSS/dataverse as explained in the :doc:`/developers/dev-environment` section. From the root of the git repo, run ``vagrant up`` and eventually you should be able to reach an installation of Dataverse at http://localhost:8888 (or whatever forwarded_port indicates in the -Vagrantfile) - -The Vagrant environment can also be used for Shibboleth testing in -conjunction with PageKite configured like this: - -service_on = http:@kitename : localhost:8888 : @kitesecret - -service_on = https:@kitename : localhost:9999 : @kitesecret +Vagrantfile). -Please note that before running ``vagrant up`` for the first time, -you'll need to ensure that required software (GlassFish, Solr, etc.) -is available within Vagrant. If you type ``cd downloads`` and -``./download.sh`` the software should be properly downloaded. +Please note that running ``vagrant up`` for the first time should run the ``downloads/download.sh`` script for you to download required software such as Glassfish and Solr and any patches. However, these dependencies change over time so it's a place to look if ``vagrant up`` was working but later fails. MSV +++ diff --git a/doc/sphinx-guides/source/index.rst b/doc/sphinx-guides/source/index.rst index 8fd7056c032..fa398008743 100755 --- a/doc/sphinx-guides/source/index.rst +++ b/doc/sphinx-guides/source/index.rst @@ -3,10 +3,10 @@ You can adapt this file completely to your liking, but it should at least contain the root `toctree` directive. -Dataverse 4.5.1 Guides -====================== +Dataverse 4.6 Guides +==================== -These guides are for the most recent version of Dataverse. For the guides for **version 4.5** please go `here `_. +These guides are for the most recent version of Dataverse. For the guides for **version 4.5.1** please go `here `_. .. 
toctree:: :glob: diff --git a/doc/sphinx-guides/source/installation/administration.rst b/doc/sphinx-guides/source/installation/administration.rst index 59cf11652a3..201452a4c2f 100644 --- a/doc/sphinx-guides/source/installation/administration.rst +++ b/doc/sphinx-guides/source/installation/administration.rst @@ -67,6 +67,13 @@ https://github.com/IQSS/dataverse/issues/2595 contains some information on enabl There is a database table called ``actionlogrecord`` that captures events that may be of interest. See https://github.com/IQSS/dataverse/issues/2729 for more discussion around this table. +Maintenance +----------- + +When you have scheduled down time for your production servers, we provide a `sample maintenance page <../_static/installation/files/etc/maintenance/maintenance.xhtml>`_ for you to use. To download, right-click and select "Save Link As". + +The maintenance page is intended to be a static page served by Apache to provide users with a nicer, more informative experience when the site is unavailable. + User Administration ------------------- diff --git a/doc/sphinx-guides/source/installation/config.rst b/doc/sphinx-guides/source/installation/config.rst index 2ebed65457f..230e87808b7 100644 --- a/doc/sphinx-guides/source/installation/config.rst +++ b/doc/sphinx-guides/source/installation/config.rst @@ -16,9 +16,9 @@ Securing Your Installation Blocking API Endpoints ++++++++++++++++++++++ -The :doc:`/api/native-api` contains a useful but potentially dangerous API endpoint called "admin" that allows you to change system settings, make ordinary users into superusers, and more. There is a "test" API endpoint used for development and troubleshooting that has some potentially dangerous methods. The ``builtin-users`` endpoint lets people create a local/builtin user account if they know the ``BuiltinUsers.KEY`` value described below. 
+The :doc:`/api/native-api` contains a useful but potentially dangerous API endpoint called "admin" that allows you to change system settings, make ordinary users into superusers, and more. The ``builtin-users`` endpoint lets people create a local/builtin user account if they know the ``BuiltinUsers.KEY`` value described below. -By default, all APIs can be operated on remotely and without the need for any authentication. https://github.com/IQSS/dataverse/issues/1886 was opened to explore changing these defaults, but until then it is very important to block both the "admin" and "test" endpoint (and at least consider blocking ``builtin-users``). For details please see also the section on ``:BlockedApiPolicy`` below. +By default, all APIs can be operated on remotely and without the need for any authentication. https://github.com/IQSS/dataverse/issues/1886 was opened to explore changing these defaults, but until then it is very important to block the "admin" endpoint (and at least consider blocking ``builtin-users``). For details please see also the section on ``:BlockedApiPolicy`` below. Forcing HTTPS +++++++++++++ @@ -119,11 +119,11 @@ It's also possible to change these values by stopping Glassfish, editing ``glass dataverse.fqdn ++++++++++++++ -If the Dataverse server has multiple DNS names, this option specifies the one to be used as the "official" host name. For example, you may want to have dataverse.foobar.edu, and not the less appealling server-123.socsci.foobar.edu to appear exclusively in all the registered global identifiers, Data Deposit API records, etc. +If the Dataverse server has multiple DNS names, this option specifies the one to be used as the "official" host name. For example, you may want dataverse.foobar.edu, and not the less appealing server-123.socsci.foobar.edu, to appear exclusively in all the registered global identifiers, Data Deposit API records, etc. The password reset feature requires ``dataverse.fqdn`` to be configured. 
-| Do note that whenever the system needs to form a service URL, by default, it will be formed with ``https://`` and port 443. I.e., | ``https://{dataverse.fqdn}/`` | If that does not suit your setup, you can define an additional option, ``dataverse.siteUrl``, explained below. @@ -245,9 +245,9 @@ Out of the box, all API endpoints are completely open as mentioned in the sectio :BlockedApiEndpoints ++++++++++++++++++++ -A comma separated list of API endpoints to be blocked. For a production installation, "admin" and "test" should be blocked (and perhaps "builtin-users" as well), as mentioned in the section on security above: +A comma-separated list of API endpoints to be blocked. For a production installation, "admin" should be blocked (and perhaps "builtin-users" as well), as mentioned in the section on security above: -``curl -X PUT -d "admin,test,builtin-users" http://localhost:8080/api/admin/settings/:BlockedApiEndpoints`` +``curl -X PUT -d "admin,builtin-users" http://localhost:8080/api/admin/settings/:BlockedApiEndpoints`` See the :doc:`/api/index` for a list of API endpoints. @@ -355,7 +355,7 @@ For dynamically adding information to the top of every page. For example, "For t :MaxFileUploadSizeInBytes +++++++++++++++++++++++++ -Set `MaxFileUploadSizeInBytes` to "2147483648", for example, to limit the size of files uploaded to 2 GB. +Set `MaxFileUploadSizeInBytes` to "2147483648", for example, to limit the size of files uploaded to 2 GB. Notes: - For SWORD, this size is limited by the Java Integer.MAX_VALUE of 2,147,483,647. (see: https://github.com/IQSS/dataverse/issues/2169) - If the MaxFileUploadSizeInBytes is NOT set, uploads, including SWORD, may be of unlimited size. @@ -501,3 +501,10 @@ Host FQDN or URL of your Piwik instance before the ``/piwik.php``. 
Examples: or ``curl -X PUT -d hostname.domain.tld/stats http://localhost:8080/api/admin/settings/:PiwikAnalyticsHost`` + +:FileFixityChecksumAlgorithm +++++++++++++++++++++++++++++ + +Dataverse calculates checksums for uploaded files so that users can determine if their file was corrupted via upload or download. This is sometimes called "file fixity": https://en.wikipedia.org/wiki/File_Fixity + +The default checksum algorithm used is MD5 and should be sufficient for establishing file fixity. "SHA-1" is an experimental alternate value for this setting. diff --git a/doc/sphinx-guides/source/installation/prep.rst b/doc/sphinx-guides/source/installation/prep.rst index 3b828b5e715..c984db29533 100644 --- a/doc/sphinx-guides/source/installation/prep.rst +++ b/doc/sphinx-guides/source/installation/prep.rst @@ -16,7 +16,7 @@ Choose Your Own Installation Adventure Vagrant (for Testing Only) ++++++++++++++++++++++++++ -If you are looking to simply kick the tires on Dataverse and are familiar with Vagrant, running ``vagrant up`` after cloning the Dataverse repo **should** give you a working installation at http://localhost:8888 . This is one of the :doc:`/developers/tools` developers use to test the installation process but you're welcome to give it a shot. +If you are looking to simply kick the tires on installing Dataverse and are familiar with Vagrant, you are welcome to read through the "Vagrant" section of the :doc:`/developers/tools` section of the Developer Guide. Checking out a tagged release is recommended rather than running ``vagrant up`` on unreleased code. 
Pilot Installation ++++++++++++++++++ diff --git a/doc/sphinx-guides/source/installation/prerequisites.rst b/doc/sphinx-guides/source/installation/prerequisites.rst index 4a30c5b9a12..b6a59f536b0 100644 --- a/doc/sphinx-guides/source/installation/prerequisites.rst +++ b/doc/sphinx-guides/source/installation/prerequisites.rst @@ -190,4 +190,31 @@ Installing jq # chmod +x jq # jq --version +ImageMagick +----------- + +Dataverse uses `ImageMagick `_ to generate thumbnail previews of PDF files. This is an optional component: if you don't have ImageMagick installed, there will be no thumbnails for PDF files in the search results or on the dataset pages, but everything else will work. (Thumbnail previews for non-PDF image files are generated using standard Java libraries and do not require any special installation steps.) + +Installing and configuring ImageMagick +====================================== + +On Red Hat and similar Linux distributions, you can install ImageMagick with something like:: + + # yum install ImageMagick + +(most Red Hat systems will have it pre-installed). +When installed using the standard ``yum`` mechanism above, the executable for the ImageMagick convert utility will be located at ``/usr/bin/convert``. No further configuration steps will then be required. + +On MacOS you can compile ImageMagick from source, or use one of the popular installation frameworks, such as Homebrew. + +If the installed location of the convert executable is different from ``/usr/bin/convert``, you will also need to specify it in your Glassfish configuration using the JVM option below. For example:: + + -Ddataverse.path.imagemagick.convert=/opt/local/bin/convert + +(see the :doc:`config` section for more information on the JVM options) + + Now that you have all the prerequisites in place, you can proceed to the :doc:`installation-main` section. 
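When scripting an installation, the convert-location check can be automated. A small Python sketch (the helper names and fallback logic are hypothetical, not part of the Dataverse installer) that decides whether the JVM option is needed:

```python
import shutil

# Sketch: resolve the ImageMagick convert executable and decide whether the
# -Ddataverse.path.imagemagick.convert JVM option must be set in Glassfish.
# Helper names and fallback logic are illustrative, not Dataverse code.
DEFAULT_CONVERT = "/usr/bin/convert"  # where the standard yum install puts it

def find_convert():
    """Prefer the convert found on PATH; fall back to the yum location."""
    return shutil.which("convert") or DEFAULT_CONVERT

def jvm_option(convert_path):
    """Return the JVM option string, or None if the default path is in use."""
    if convert_path == DEFAULT_CONVERT:
        return None
    return "-Ddataverse.path.imagemagick.convert=" + convert_path
```

For example, ``jvm_option("/opt/local/bin/convert")`` returns the ``-Ddataverse.path.imagemagick.convert=/opt/local/bin/convert`` option shown above, while the default ``/usr/bin/convert`` needs no extra configuration.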
+ + diff --git a/doc/sphinx-guides/source/installation/r-rapache-tworavens.rst b/doc/sphinx-guides/source/installation/r-rapache-tworavens.rst index a88ffa114d2..a4dca0f2ec2 100644 --- a/doc/sphinx-guides/source/installation/r-rapache-tworavens.rst +++ b/doc/sphinx-guides/source/installation/r-rapache-tworavens.rst @@ -19,6 +19,9 @@ Disable SELinux on httpd: ``getenforce`` +(Note: a pull request to get rApache working with SELinux is welcome! Please see the :doc:`/developers/selinux` section of the Developer Guide to get started.) + + https strongly recommended; signed certificate (as opposed to self-signed) is recommended. Directory listing needs to be disabled on the web documents folder served by Apache: @@ -307,7 +310,11 @@ to user apache: g. restart httpd **************** - ``service httpd restart`` +III. Enable TwoRavens' Explore Button in Dataverse +-------------------------------------------------- + +The final step of the TwoRavens installation is to tell Dataverse to display its Explore button alongside tabular datafiles by executing the following command on the Glassfish host: +``curl -X PUT -d true http://localhost:8080/api/admin/settings/:TwoRavensTabularView`` diff --git a/doc/sphinx-guides/source/installation/shibboleth.rst b/doc/sphinx-guides/source/installation/shibboleth.rst index 7a93eb1eb52..cb3c390177d 100644 --- a/doc/sphinx-guides/source/installation/shibboleth.rst +++ b/doc/sphinx-guides/source/installation/shibboleth.rst @@ -201,10 +201,56 @@ attribute-map.xml By default, some attributes in ``/etc/shibboleth/attribute-map.xml`` are commented out. Edit the file to enable them so that all the required attributes come through. You can download a `sample attribute-map.xml file <../_static/installation/files/etc/shibboleth/attribute-map.xml>`_. +Disable or Reconfigure SELinux +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +SELinux is set to "enforcing" by default on RHEL/CentOS, but unfortunately Shibboleth does not "just work" with SELinux. You have two options. 
You can disable SELinux or you can reconfigure SELinux to accommodate Shibboleth.

 Disable SELinux
-~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^
+
+The first and easiest option is to set ``SELINUX=permissive`` in ``/etc/selinux/config`` and run ``setenforce permissive``, or otherwise disable SELinux, to get Shibboleth to work. This is apparently what the Shibboleth project expects, because their wiki page at https://wiki.shibboleth.net/confluence/display/SHIB2/NativeSPSELinux says, "At the present time, we do not support the SP in conjunction with SELinux, and at minimum we know that communication between the mod_shib and shibd components will fail if it's enabled. Other problems may also occur."
+
+Reconfigure SELinux to Accommodate Shibboleth
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The second (more involved) option is to use the ``checkmodule``, ``semodule_package``, and ``semodule`` tools to apply a local policy that makes Shibboleth work with SELinux. Let's get started.
+
+Put Type Enforcement (TE) File in misc directory
+````````````````````````````````````````````````
+
+Copy and paste or download the `shibboleth.te <../_static/installation/files/etc/selinux/targeted/src/policy/domains/misc/shibboleth.te>`_ Type Enforcement (TE) file below and put it at ``/etc/selinux/targeted/src/policy/domains/misc/shibboleth.te``.
+
+.. literalinclude:: ../_static/installation/files/etc/selinux/targeted/src/policy/domains/misc/shibboleth.te
+   :language: text
+
+(If you would like to know where ``shibboleth.te`` came from and how to hack on it, please see the :doc:`/developers/selinux` section of the Developer Guide. Pull requests are welcome!)
+
+Navigate to misc directory
+``````````````````````````
+
+``cd /etc/selinux/targeted/src/policy/domains/misc``
+
+Run checkmodule
+```````````````
+
+``checkmodule -M -m -o shibboleth.mod shibboleth.te``
+
+Run semodule_package
+````````````````````
+
+``semodule_package -o shibboleth.pp -m shibboleth.mod``
+
+Silence is golden.
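The compile-and-package steps above can be sketched as one shell sequence (assuming the SELinux policy tools are installed, e.g. from the ``checkpolicy`` and ``policycoreutils`` packages):

```shell
# Build a local SELinux policy module for Shibboleth
cd /etc/selinux/targeted/src/policy/domains/misc

# Compile the Type Enforcement file into a policy module
checkmodule -M -m -o shibboleth.mod shibboleth.te

# Package the compiled module; both commands succeed silently
semodule_package -o shibboleth.pp -m shibboleth.mod
```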
No output is expected.
+
+Run semodule
+````````````
+
+``semodule -i shibboleth.pp``
+
+Silence is golden. No output is expected. This will place a file at ``/etc/selinux/targeted/modules/active/modules/shibboleth.pp`` and include "shibboleth" in the output of ``semodule -l``. See the ``semodule`` man page if you ever want to remove or disable the module you just added.
-You must set ``SELINUX=permisive`` in ``/etc/selinux/config`` and run ``setenforce permissive`` or otherwise disable SELinux for Shibboleth to work. "At the present time, we do not support the SP in conjunction with SELinux, and at minimum we know that communication between the mod_shib and shibd components will fail if it's enabled. Other problems may also occur." -- https://wiki.shibboleth.net/confluence/display/SHIB2/NativeSPSELinux
+Congrats! You've made the creator of http://stopdisablingselinux.com proud. :)

 Restart Apache and Shibboleth
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

diff --git a/doc/sphinx-guides/source/user/dataverse-management.rst b/doc/sphinx-guides/source/user/dataverse-management.rst
index bac42305119..b3a1d7e95f1 100755
--- a/doc/sphinx-guides/source/user/dataverse-management.rst
+++ b/doc/sphinx-guides/source/user/dataverse-management.rst
@@ -12,7 +12,7 @@ to manage the settings described in this guide.
 Create a Dataverse (Within the "Root" Dataverse)
 ===================================================
-Creating a dataverse is easy but first you must be a registered user (see Create Account).
+Creating a dataverse is easy but first you must be a registered user (see :doc:`/user/account`).
 #. Once you are logged in click on the "Add Data" button and in the dropdown menu select "New Dataverse".
 #.
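The install-and-verify steps can be done in one go. A small sketch (assumes root, and that ``shibboleth.pp`` was produced by the previous step):

```shell
# Install the packaged policy module (no output on success)
semodule -i shibboleth.pp

# Verify: "shibboleth" should now appear among the installed modules
semodule -l | grep shibboleth
```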
Once on the "New Dataverse" page fill in the following fields:
@@ -22,7 +22,7 @@ Creating a dataverse is easy but first you must be a registered user (see Create
 * **Affiliation**: Add any affiliation that can be associated with this particular dataverse (e.g., project name, institute name, department name, journal name, etc.). This is automatically filled out if you have added an affiliation for your user account.
 * **Description**: Provide a description of this dataverse. This will display on the home page of your dataverse and in the search result list. The description field supports certain HTML tags.
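A dataverse can also be created programmatically rather than through the "New Dataverse" page. The sketch below uses Dataverse's native API with the ``dataverse-minimal.json`` example file added in this pull request; ``SERVER_URL`` and ``API_TOKEN`` are placeholders for your installation's URL and your account's API token:

```shell
# Create a new dataverse under the root dataverse via the native API
export SERVER_URL=http://localhost:8080
export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx

curl -H "X-Dataverse-key:$API_TOKEN" \
     -X POST "$SERVER_URL/api/dataverses/root" \
     --upload-file dataverse-minimal.json
```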