- Detailed installation and configuration steps for CentOS/RHEL-flavored Linux
- Notes and tips about HTTP requests
- Configuring the use of HTTPS
Here are the steps I took to install and set up this service on a CentOS 7.7 system. (Note: all of the following commands are performed as root.)
-   Create a user account for the service on the host system (e.g., `corsproxy`). On a CentOS 7 system, this can be done using the following command; note the use of the `-k` argument to prevent copying default skeleton files to the home directory, because we will fill the home directory with something else in the next step.

    ```sh
    useradd -r -m -c "CORS proxy server" -k /dev/null corsproxy
    ```
-   Clone this git repository into the account directory on the host system:

    ```sh
    cd /home/corsproxy
    git clone --recursive https://github.com/caltechlibrary/corsproxy.git .
    ```
-   Install the NodeJS dependencies in the `server` subdirectory:

    ```sh
    cd /home/corsproxy/server
    npm install cors-anywhere
    ```
-   Change the user and group of everything to the proxy user and group:

    ```sh
    cd /home/corsproxy
    chown -R corsproxy:corsproxy .
    ```
-   Create a directory in `/var/run` where the proxy user can write the process id file:

    ```sh
    mkdir /var/run/corsproxy
    chown corsproxy:corsproxy /var/run/corsproxy
    ```
-   Install the `rsyslogd` configuration file, and tell `rsyslogd` to load it:

    ```sh
    cd /home/corsproxy/admin/system
    cp corsproxy-rsyslog.conf /etc/rsyslog.d/corsproxy.conf
    mkdir /var/log/corsproxy
    chown corsproxy:corsproxy /var/log/corsproxy
    systemctl restart rsyslog
    ```
-   Install the `systemd` service file and tell `systemd` about it:

    ```sh
    cp corsproxy.service /etc/systemd/system/
    systemctl daemon-reload
    ```
-   Install the `logrotate` script:

    ```sh
    cp corsproxy-logrotate.txt /etc/logrotate.d/corsproxy
    ```
Configure the CORS proxy server by copying the template configuration file to create `config.sh`, then editing `config.sh` to set the variable values as needed for your installation:

```sh
cd /home/corsproxy/admin
cp config.sh.template config.sh
# edit config.sh
```
The values of the variables `RATELIMIT` and `REQUIRED_HEADER` are the most important to set in order to help prevent abuse of the service. Information about them can be found in the `config.sh.template` file. Note: the way that restrictions on origins work is currently limited, in that hosts are restricted based on the value of the `Origin` header in the HTTP request, not the actual host name or IP address of the source of the request. To block hosts by IP address range, configure your system's firewall appropriately (see the next steps). See the discussion further below for more on this topic.
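As a sketch only, a filled-in `config.sh` might contain values along these lines. The authoritative syntax and meaning of each variable are documented in `config.sh.template`, so check there before copying anything; the values and the origin name below are purely illustrative.

```shell
# Illustrative values only -- see config.sh.template for the real syntax.
# The first two numbers set the rate limit applied to unlisted origins
# (requests per time period); origins listed after them are exempt.
RATELIMIT="30 1 myapp.example.org"

# Reject proxy requests that lack this header (name only, no value).
REQUIRED_HEADER="x-proxy-cors"
```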
Check your firewall settings and make sure they permit connections to the port you configured. Specific instructions for doing this cannot be given here, as they depend very much on your firewall scheme. Also make sure to save the new firewall configuration (how to do this again depends on your particular system) so that it persists across reboots of your computer.
At this point, everything is in place, and what remains is to tell the operating system to install the new service and start it up. Before going further, it may be helpful to open another window and run `tail -f /var/log/messages` to keep an eye out for system messages.
-   Enable the new service:

    ```sh
    systemctl enable corsproxy.service
    ```

-   Start the service:

    ```sh
    systemctl start corsproxy.service
    ```

-   Check the status:

    ```sh
    systemctl status corsproxy.service
    ```
If all goes well, a `node` process should be running under the user credentials of `corsproxy`. Log output should also appear in a new log file located at `/var/log/corsproxy/corsproxy.log`, but it will also get printed to `/var/log/messages`. If log output is only printed in `/var/log/messages`, something has gone wrong.
If things do not go well, inspect the log messages and try to determine whether there is a configuration problem or a permissions problem. One source of permissions problems is SELinux: you may find that the service simply refuses to run due to file access errors even when you have checked the permissions of all the files and directories. If that happens, check the value of the variable `SELINUX` in `/etc/selinux/config`: if it is `enforcing`, then you will have to take additional steps to let the system read the corsproxy files and start the process, or else change the SELinux mode to `permissive`. (The latter will reduce the security of the system, so beware.)
Here are some suggested steps to take to verify that the service is running:
- On the host computer running the proxy, after starting the proxy service, check that the proxy process is running (look for a `node` process owned by user `corsproxy` in the output of `ps auxww`), and also check that something is listening on the desired port (for example, look at the output of `netstat -at`).
- Next, open a new terminal window, `ssh` into the server running the proxy, and run `tail -f` on `/var/log/messages`. Do the same for `/var/log/corsproxy/corsproxy.log` in another window.
- Now try to connect to the proxy's landing page from a browser on your local computer, by visiting the top-level page on the host and port. For example, if your proxy is running on port 8080 of the computer responding to `x.org`, the proxy page would be `http://x.org:8080` (or `https://x.org:8080` if you have configured the use of HTTPS as discussed below). This landing page is not limited by the setting of `RATELIMIT` in the configuration file, so if you cannot access it, something else is wrong; perhaps the firewall settings on the server prevent access to that port from the outside.
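If you prefer to script the landing-page check, a small probe along the following lines can confirm that something is answering on the proxy's port. This is a self-contained sketch: the stub HTTP server below stands in for a running corsproxy so the snippet runs anywhere; in real use you would point `probe()` at your actual host and port instead.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

def probe(url):
    """Return the HTTP status code of the landing page, or None if
    nothing answered (connection refused, timeout, and so on)."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status
    except OSError:
        return None

# Stand-in for a running corsproxy landing page, used here only so the
# sketch is runnable without a live server.
class StubLandingPage(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"stub landing page")

    def log_message(self, *args):
        pass  # keep the demo quiet

server = HTTPServer(("127.0.0.1", 0), StubLandingPage)
threading.Thread(target=server.serve_forever, daemon=True).start()

status = probe("http://127.0.0.1:%d/" % server.server_address[1])
server.shutdown()
print(status)  # 200 when the landing page is reachable
```

A `None` result points at connectivity (firewall, wrong port, service not running) rather than at the proxy's rate limiting, since the landing page is exempt from `RATELIMIT`.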
The following are notes about some lessons learned.
A frustrating gotcha in testing JavaScript programs embedded in web pages is how web browsers handle CORS requests. In particular, suppose that you have some combination of JavaScript and HTML in a web page (such as for a single-page application, perhaps one using vue.js), and the JavaScript code makes requests to remote services with data payloads in the requests. These are the kinds of requests that trigger CORS protections, and probably the reason why you are interested in this CORS proxy.
Loading a local file is probably the most common way of testing your application during development. Here is the catch: browsers set the HTTP header `Origin` to `null` when HTTP requests come from HTML+JavaScript pages loaded from a local file. In other words, if the URL in your browser's location bar begins with `file://`, HTTP requests generated by JavaScript code in that page will have `Origin: null` when they reach the CORS proxy server. Since corsproxy's `RATELIMIT` setting uses the value of the `Origin` header, the `RATELIMIT` setting will not work in this situation, or will end up causing the server to block your access.
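The effect is easy to observe with a small local experiment, sketched here in Python using only the standard library (the server below is a stand-in for the proxy, not corsproxy itself): a request carrying `Origin: null`, exactly as a browser would send for a `file://`-loaded page, is indistinguishable from any other such request.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

seen_origins = []

class Handler(BaseHTTPRequestHandler):
    # Record the Origin header of each request, as a rate limiter
    # keyed on Origin would see it.
    def do_GET(self):
        seen_origins.append(self.headers.get("Origin"))
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):
        pass  # keep the demo quiet

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_address[1]

# Browsers send "Origin: null" for pages loaded from file:// URLs;
# simulate that here.
req = urllib.request.Request(url, headers={"Origin": "null"})
urllib.request.urlopen(req).read()
server.shutdown()

print(seen_origins)  # ['null'] -- every file://-loaded client looks identical
```

Since all such clients share the single origin string `null`, an origin-based rate limit can only treat them as one anonymous group.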
Here are some suggestions for working around this:
- One approach is to set up a private copy of the proxy running on a computer that you control; then you can configure the firewall on the host computer to block access to the proxy port from any source other than your client computer.
- If you are the only one using the proxy during your development work, one solution is to adjust the firewall settings on the server running corsproxy to block access from anything other than your client computer.
- Another approach, if you need to share the proxy server with other people or can't change the firewall for some reason, is to adjust the `RATELIMIT` setting in the corsproxy configuration so that it does not completely block access to unrecognized hosts. One way to do this is to rely on the rate limit for other origins (i.e., those controlled by the first two numbers in the `RATELIMIT` value). Set it to something high enough that it does not impede your development workflow, but still low enough to prevent abuse by wannabe hackers doing port scans on your organization's computers.
Suppose that you are clever and work around the `file://` limitation discussed above by starting a local HTTP server, perhaps using the one-line Python command

```sh
python3 -m http.server
```

and then opening a web browser window on `http://localhost:8000/yourfilename.html`. Well done! This avoids `Origin: null` in the HTTP headers. However, the resulting `Origin` header will then have the value `http://localhost:8000`, which is again not a good basis for setting the `RATELIMIT` configuration variable in your CORS proxy server. As with the local file approach described above, solutions include adjusting the firewall configuration on the host computer to block anything other than the IP address of your client, or setting `RATELIMIT` such that the default values (i.e., those applied to hosts without designated `Origin` values) allow some access from any client.
The `REQUIRED_HEADER` setting in the configuration file can be used to identify a header that must be present in HTTP requests in order for proxy accesses to succeed. It should be a single header name, without a value. For example:

```sh
REQUIRED_HEADER="x-proxy-cors"
```

The header name is compared in a case-insensitive manner. Proxy requests that lack this HTTP header will be rejected. Add the header to the requests made by the network code in the client software you control.
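The behavior can be sketched as follows. This is an illustration of the check as described above, not corsproxy's actual implementation: the header name is matched case-insensitively, and its value is not examined.

```python
REQUIRED_HEADER = "x-proxy-cors"  # example value from above

def allowed(headers):
    """Accept a request only if the required header name is present
    among its headers; the comparison is case-insensitive and the
    header's value is ignored."""
    return any(name.lower() == REQUIRED_HEADER.lower() for name in headers)

print(allowed({"X-Proxy-CORS": "1", "Origin": "https://app.example.org"}))  # True
print(allowed({"Origin": "https://app.example.org"}))                       # False
```

In a browser client, the equivalent would be adding the header to each request your code makes, for example via the `headers` option of `fetch`.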
It should be clear that this is a kind of security by obscurity approach. It is meant to limit proxying to software only you control. It has benefit only as long as you do not advertise the fact that your proxy looks for the header. (And note that revealing the nature of the header can happen accidentally via the client software that you write. Do not do things like hard-wire the header value into open-source software you put on GitHub, where sooner or later someone will find it.)
Corsproxy supports using HTTPS instead of HTTP. To do that, you need to set the relevant configuration variables in your `config.sh` file to reference the key and certificate files needed by HTTPS.

If you do not already have a certificate for use with corsproxy, you can obtain one easily with Certbot. Here are the steps to follow to set up corsproxy with HTTPS on a CentOS system.
-   Follow the instructions given on the Certbot web page to generate and install the necessary files on your server. By default, they will be placed in the directory `/etc/letsencrypt`. For example, if your host and domain are named `hostname.hostdomain.com`, then a number of files will be created in `/etc/letsencrypt/live/hostname.hostdomain.com` and `/etc/letsencrypt/archive/hostname.hostdomain.com`.
. -
Change the permissions on the new files in
/etc/letsencrypt
to allow thecorsproxy
process to read them, and also to be able to write in thearchive
subdirectory. This can be done as follows (here assuming that the process group name iscorsproxy
):chmod -R 0770 /etc/letsencrypt/archive/ chmod -R 0750 /etc/letsencrypt/live/ chgrp -R corsproxy /etc/letsencrypt/archive/ chgrp -R corsproxy /etc/letsencrypt/live
-   Edit the `config.sh` file for your copy of corsproxy to set the values of the `KEY_FILE` and `CERT_FILE` variables. Continuing with the example of `hostname.hostdomain.com`, the values would be as follows:

    ```sh
    KEY_FILE="/etc/letsencrypt/live/hostname.hostdomain.com/privkey.pem"
    CERT_FILE="/etc/letsencrypt/live/hostname.hostdomain.com/fullchain.pem"
    ```
That should be enough. Now you can restart the corsproxy server process, change your client's configuration to use `https` instead of `http` in the address for the proxy server, and try to connect through the proxy. Watch the log files for indications of whether things are working or not.