Analysis of current SCL3 Apache config, and possible conversion to Python #180

Closed
bookshelfdave opened this issue Apr 19, 2017 · 16 comments

@bookshelfdave
Contributor

bookshelfdave commented Apr 19, 2017

Parent issue

Once we have the SCL3 Apache config, determine:

  • the viability of converting this config to Python code
  • the amount of effort required
  • technical debt carried forward

I'll need more info on the Python libs that we use for similar tasks.

@bookshelfdave
Contributor Author

bookshelfdave commented Apr 21, 2017

I spent a day with the existing SCL3 Apache config.

Here are some crude (low level) notes on my investigation: https://gist.github.com/metadave/f7f84d62e9aac10ca1d116e57b40a0f0

Summary

The existing Apache config is complex, and running it in Docker and Kubernetes has proven to be extremely difficult. Even running a partial Apache config for redirects and rewrites presents several technical challenges, such as converting the config to support our AWS infrastructure, compiling modules against an older version of Apache (2.2), and lots of table flipping.

I recommend migrating the important redirects and rewrites to Django/Python, as I believe this path will take less time than running the existing Apache config in AWS (either in Kubernetes OR standalone on EC2 instances). We already have a Jenkins multibranch pipeline that pushes our demos out to Kubernetes, so we won't need to bring the SCL3 deployment scripts/tooling over to AWS/Kubernetes.

Web/App tiers

The existing Apache config runs both Apache and Django on the same instance. This makes packaging to run in Docker significantly more difficult, as we'd either have to run Apache and Django/Kuma from a single Docker image (resulting in a compilation nightmare and huge images), or set up named pipes between multiple containers so Apache can connect to Django. Kubernetes Services and Deployments do this for us "for free".

As Kuma is already running in demo mode in Kubernetes, we can take advantage of the new setup and either have Django handle all incoming requests (w/ ELB terminating TLS), or run nginx pods to do some more web lifting for us.

You can see my attempt to compile some of the additional Apache modules in the context of a new Docker image here.

Rewrites and Redirects

I recommend implementing redirects/rewrites in Python and/or nginx to keep things simple.

@pmac is planning to implement some common redirect functionality in Bedrock that we can leverage.

EDIT: @pmac has released this lib: https://github.com/pmac/django-redirect-urls

Grepping through the Apache config, these are the rewrites and redirects:

Rewrites:

mozilla/defaults.conf
10:    RewriteEngine On
11:    RewriteCond %{REQUEST_URI} !^/server-status$
12:    RewriteCond %{REQUEST_URI} !^/server-info$
13:    RewriteCond %{REQUEST_URI} !^/apc.php$
14:    RewriteRule .* - [F]

mozilla/domains/developer.cdn.mozilla.net.conf
26:    RewriteEngine on
27:    RewriteRule ^/media/(redesign/)?img(.*) /static/img$2 [L,R=301]

mozilla/domains/developer.mozilla.org.conf
22:    RewriteEngine On
25:    RewriteCond %{SERVER_NAME} ^developer\.mozilla\.com$
26:    RewriteRule (.*) http://developer.mozilla.org$1 [R=301,L]
66:    RewriteRule ^/media/(redesign/)?css/(.*)-min.css$ /static/build/styles/$2.css [L,R=301]
67:    RewriteRule ^/media/(redesign/)?js/(.*)-min.js$ /static/build/js/$2.js [L,R=301]
69:    RewriteRule ^/media/(redesign/)?img(.*) /static/img$2 [L,R=301]
70:    RewriteRule ^/media/(redesign/)?css(.*) /static/styles$2 [L,R=301]
71:    RewriteRule ^/media/(redesign/)?js(.*) /static/js$2 [L,R=301]
79:    RewriteRule ^/Special:UserLogin\??(.*) /index.php?title=Special:UserLogin&$1 [R]
82:    RewriteRule ^/media/(redesign/)?fonts(.*) /static/fonts$2 [L,R=301]
87:    RewriteRule ^(.*)//(.*)//(.*)$ $1_$2_$3 [R=301,L,NC]
88:    RewriteRule ^(.*)//(.*)$ $1_$2 [R=301,L,NC]
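
A minimal sketch of how the /media rewrites above might translate to plain Python, using only the standard library `re` module. The names `MEDIA_REDIRECTS` and `resolve_media_redirect` are hypothetical, and this is not the django-redirect-urls API. The dots in `-min.css` and `-min.js` are escaped here, which is slightly stricter than the original Apache patterns.

```python
import re

# (pattern, replacement) pairs in the same order as the Apache rules;
# the first match wins, mirroring the [L] flag.
MEDIA_REDIRECTS = [
    (re.compile(r"^/media/(?:redesign/)?css/(.*)-min\.css$"), r"/static/build/styles/\1.css"),
    (re.compile(r"^/media/(?:redesign/)?js/(.*)-min\.js$"), r"/static/build/js/\1.js"),
    (re.compile(r"^/media/(?:redesign/)?img(.*)"), r"/static/img\1"),
    (re.compile(r"^/media/(?:redesign/)?css(.*)"), r"/static/styles\1"),
    (re.compile(r"^/media/(?:redesign/)?js(.*)"), r"/static/js\1"),
    (re.compile(r"^/media/(?:redesign/)?fonts(.*)"), r"/static/fonts\1"),
]


def resolve_media_redirect(path):
    """Return the 301 target for a legacy /media path, or None if no rule applies."""
    for pattern, replacement in MEDIA_REDIRECTS:
        match = pattern.match(path)
        if match:
            return match.expand(replacement)
    return None
```

A Django view or middleware could call `resolve_media_redirect(request.path)` and issue a permanent redirect when it returns a target; that wiring is left out here.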

Redirects:

conf/httpd.conf
600:# Redirect allows you to tell clients about documents which used to exist in
604:# Redirect permanent /foo http://www.example.com/bar

conf/httpd.conf.rpmnew
590:# Redirect allows you to tell clients about documents which used to exist in
594:# Redirect permanent /foo http://www.example.com/bar

mozilla/domains/developer.cdn.mozilla.net.conf
32:    RedirectMatch 302 /media/uploads/demos/(.*)$ https://developer.mozilla.org/docs/Web/Demos_of_open_web_technologies/

mozilla/domains/developer.mozilla.org.conf
4:    Redirect permanent / https://developer.mozilla.org/
10:    Redirect permanent / https://developer.mozilla.org/
16:    Redirect permanent / https://developer.mozilla.org/
85:    RedirectMatch 302 /media/uploads/demos/(.*)$ https://developer.mozilla.org/docs/Web/Demos_of_open_web_technologies/
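
The RedirectMatch rule for demo uploads is simpler than the rewrites: the pattern is unanchored, and the captured group is discarded in favor of one fixed target. A rough Python equivalent, with the hypothetical names `DEMOS_PATTERN`, `DEMOS_TARGET`, and `demo_redirect`:

```python
import re

# RedirectMatch patterns are not anchored at the start, so use search().
DEMOS_PATTERN = re.compile(r"/media/uploads/demos/(.*)$")
DEMOS_TARGET = "https://developer.mozilla.org/docs/Web/Demos_of_open_web_technologies/"


def demo_redirect(path):
    """Return (status, location) for legacy demo upload paths, else None."""
    if DEMOS_PATTERN.search(path):
        return 302, DEMOS_TARGET
    return None
```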

Serving static content

Kuma serves several kinds of non-(image, CSS, JS) content from the /data directory. We can use a shared EFS mount to distribute this content to each running Kubernetes pod. Since EFS looks just like a standard filesystem, we can organize dev/stage/prod files into an appropriate directory structure.

Additional Apache modules

The existing Apache config uses some additional modules to handle some log rewriting for incoming Cloudflare request addresses, and communications from Apache to Python:

  • mod_cloudflare
  • mod_wsgi w/ Python 2.7

Misc Apache config

Below are some things that we may have to address when moving to the new config. I'm sure there are additional Important Details that need to be extracted from the Apache config as well.

Custom headers:
# Security headers 1297878
Header set X-Content-Type-Options "nosniff"
Header set X-XSS-Protection "1; mode=block"
Header set Strict-Transport-Security "max-age=63072000"
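
If Apache goes away, these headers have to be set somewhere else. One option, sketched here as a small WSGI wrapper under the assumption that Django (or any WSGI app) serves the responses directly; `SECURITY_HEADERS` and `add_security_headers` are hypothetical names:

```python
SECURITY_HEADERS = [
    ("X-Content-Type-Options", "nosniff"),
    ("X-XSS-Protection", "1; mode=block"),
    ("Strict-Transport-Security", "max-age=63072000"),
]


def add_security_headers(app):
    """Wrap a WSGI app so every response carries the headers above."""
    names = {name.lower() for name, _ in SECURITY_HEADERS}

    def middleware(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Like Apache's "Header set", replace any existing header of the same name.
            kept = [(k, v) for k, v in headers if k.lower() not in names]
            return start_response(status, kept + SECURITY_HEADERS, exc_info)

        return app(environ, patched_start_response)

    return middleware
```

The same result could come from framework settings or from nginx; the wrapper just makes the required behavior explicit.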

Additional file types:
AddType application/ogg .ogx
AddType audio/ogg .ogg .spx
AddType video/ogg .ogv
AddType application/json .json
AddEncoding x-gzip .jsonz
AddType application/octet-stream .dump
AddType x-java-archive .jar
AddType image/svg+xml .svg
AddType application/x-xpinstall .xpi
AddType video/webm .webm
AddType text/cache-manifest .appcache
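
If Python ends up serving any of these files, the standard library `mimetypes` module can carry most of the AddType mappings. This is only a sketch: `EXTRA_TYPES` and `register_extra_types` are hypothetical names, the `.jar` entry is left out because `x-java-archive` is not in type/subtype form, and the `AddEncoding x-gzip .jsonz` line is not covered.

```python
import mimetypes

# Extension -> MIME type, taken from the AddType lines above.
EXTRA_TYPES = {
    ".ogx": "application/ogg",
    ".ogg": "audio/ogg",
    ".spx": "audio/ogg",
    ".ogv": "video/ogg",
    ".json": "application/json",
    ".dump": "application/octet-stream",
    ".svg": "image/svg+xml",
    ".xpi": "application/x-xpinstall",
    ".webm": "video/webm",
    ".appcache": "text/cache-manifest",
}


def register_extra_types():
    """Register the extra mappings with the process-wide mimetypes table."""
    for extension, mime_type in EXTRA_TYPES.items():
        mimetypes.add_type(mime_type, extension)


register_extra_types()
```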

Log format:
LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
    ErrorLog "|/usr/sbin/rotatelogs -L /var/log/httpd/developer.cdn.mozilla.net/error_log /var/log/httpd/developer.cdn.mozilla.net/error_log_%Y-%m-%d-%H 3600 -0"
    CustomLog "|/usr/sbin/rotatelogs -L /var/log/httpd/developer.cdn.mozilla.net/access_log /var/log/httpd/developer.cdn.mozilla.net/access_%Y-%m-%d-%H 3600 -0" combined
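
Should we want access logs in the same shape without Apache, the LogFormat string is easy to reproduce. A sketch, assuming a WSGI-style `environ` dict; `format_access_line` is a hypothetical helper, and `%l` and `%u` are always written as `-`:

```python
from datetime import datetime, timezone


def format_access_line(environ, status, size, when):
    """Format one access-log line in the 'combined' layout shown above."""

    def header(key):
        return environ.get(key) or "-"

    target = environ.get("PATH_INFO", "-")
    if environ.get("QUERY_STRING"):
        target += "?" + environ["QUERY_STRING"]
    request_line = "{} {} {}".format(
        environ.get("REQUEST_METHOD", "-"), target, environ.get("SERVER_PROTOCOL", "-")
    )
    return '{} - - {} "{}" {} {} "{}" "{}"'.format(
        header("HTTP_X_FORWARDED_FOR"),           # %{X-Forwarded-For}i
        when.strftime("[%d/%b/%Y:%H:%M:%S %z]"),  # %t
        request_line,                             # %r
        status,                                   # %>s
        size if size else "-",                    # %b prints "-" for zero bytes
        header("HTTP_REFERER"),
        header("HTTP_USER_AGENT"),
    )
```

Log rotation (the `rotatelogs` part) would more likely be handled by the container platform than by the application.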

File upload caching:
<Directory /data/www/developer.mozilla.org/kuma/media/uploads>
    AllowOverride All
    AddType application/x-data .data

    # Set far-future Expires headers for uploads. bug 842751
    ExpiresActive On
    ExpiresDefault "access plus 12 hours"
    ExpiresByType  text/css "access plus 12 hours"
    ExpiresByType  text/javascript "access plus 12 hours"
    ExpiresByType  image/png "access plus 12 hours"
    ExpiresByType  image/jpeg "access plus 12 hours"
    ExpiresByType  image/gif "access plus 12 hours"
    ExpiresByType  image/vnd.microsoft.icon "access plus 12 hours"
    ExpiresByType  video/webm "access plus 12 hours"
    ExpiresByType  video/ogg "access plus 12 hours"
    ExpiresByType  video/x-flv "access plus 12 hours"
    ExpiresByType  application/x-shockwave-flash "access plus 12 hours"
    FileETag MTime
</Directory>
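
Every ExpiresByType line above uses the same 12-hour lifetime as ExpiresDefault, so the caching policy for uploads reduces to a single value. A sketch of the equivalent response headers in Python, with `UPLOAD_MAX_AGE` and `upload_cache_headers` as hypothetical names (the `FileETag MTime` behavior is not reproduced):

```python
from email.utils import formatdate

UPLOAD_MAX_AGE = 12 * 60 * 60  # "access plus 12 hours", in seconds


def upload_cache_headers(now):
    """Return caching headers for an upload served at Unix time `now`."""
    return {
        "Cache-Control": "max-age={}".format(UPLOAD_MAX_AGE),
        "Expires": formatdate(now + UPLOAD_MAX_AGE, usegmt=True),
    }
```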

@bookshelfdave
Contributor Author

Additional aliases we may have to implement:

conf/httpd.conf.rpmnew:551:Alias /icons/ "/var/www/icons/"
conf/httpd.conf.rpmnew:576:ScriptAlias /cgi-bin/ "/var/www/cgi-bin/"
conf/httpd.conf.rpmnew:855:Alias /error/ "/var/www/error/"
conf/httpd.conf:561:Alias /icons/ "/var/www/icons/"
conf/httpd.conf:586:ScriptAlias /cgi-bin/ "/var/www/cgi-bin/"
conf/httpd.conf:873:Alias /error/ "/var/www/error/"
mozilla/domains/developer.mozilla.org.conf:21:    ServerAlias developer-local developer.mozilla.com
mozilla/domains/developer.mozilla.org.conf:72:    Alias /media /data/www/developer.mozilla.org/kuma/media
mozilla/domains/developer.mozilla.org.conf:73:    Alias /admin-media /data/www/developer.mozilla.org/kuma/vendor/src/django/django/contrib/admin/media
mozilla/domains/developer.mozilla.org.conf:75:    Alias /presentations /data/www/presentations
mozilla/domains/developer.mozilla.org.conf:76:    Alias /samples /data/www/samples
mozilla/domains/developer.mozilla.org.conf:77:    Alias /diagrams /data/www/diagrams
mozilla/domains/developer.mozilla.org.conf:90:    WSGIScriptAlias /mwsgi /data/www/developer.mozilla.org/kuma/wsgi/kuma.wsgi
mozilla/domains/developer.cdn.mozilla.net.conf:3:    ServerAlias developer-origin.cdn.mozilla.net
mozilla/domains/developer.cdn.mozilla.net.conf:28:    Alias /media /data/www/developer.mozilla.org/kuma/media
mozilla/domains/developer.cdn.mozilla.net.conf:29:    Alias /admin-media /data/www/developer.mozilla.org/kuma/vendor/src/django/django/contrib/admin/media
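
For the content aliases on developer.mozilla.org, the mapping from URL prefix to directory is all that matters. A minimal sketch of that lookup, assuming the same /data paths are available (for example via the EFS mount); `ALIASES` and `resolve_alias` are hypothetical names, and /admin-media is omitted for brevity:

```python
import posixpath

# URL prefix -> directory, from the Alias lines in developer.mozilla.org.conf.
ALIASES = {
    "/media": "/data/www/developer.mozilla.org/kuma/media",
    "/presentations": "/data/www/presentations",
    "/samples": "/data/www/samples",
    "/diagrams": "/data/www/diagrams",
}


def resolve_alias(url_path):
    """Map a URL path to a filesystem path, or None if no alias applies."""
    # Normalize first so "../" segments cannot escape the aliased directory.
    clean = posixpath.normpath(url_path)
    for prefix, root in ALIASES.items():
        if clean == prefix or clean.startswith(prefix + "/"):
            return root + clean[len(prefix):]
    return None
```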

@bookshelfdave
Contributor Author

I spent some time trying to get https://github.com/pmac/django-redirect-urls to work as a replacement for some of the redirects, but I could use a ~30 minute (time-boxed!) assist from @escattone, @pmac, or @jgmize.

@escattone
Contributor

Wow, thanks for all of this detailed work @metadave ! I would be happy to assist as much as I am able.

@jgmize
Contributor

jgmize commented Apr 27, 2017

I'm available tomorrow, will ping you on IRC.

@bookshelfdave
Contributor Author

related: discussion of serving legacy samples from EFS/Kuma here

@bookshelfdave bookshelfdave moved this from In Progress to Queued in [Deprecated] MozMEAO SRE May 2, 2017
@bookshelfdave bookshelfdave moved this from Queued to In Progress in [Deprecated] MozMEAO SRE May 2, 2017
@bookshelfdave
Contributor Author

Excluding the following rewrite rule, which may be left over from dekiwiki:

RewriteRule ^/Special:UserLogin\??(.*) /index.php?title=Special:UserLogin&$1 [R]

https://developer.mozilla.org/index.php doesn't currently do anything.

https://github.com/mozilla/kumascript/blob/master/macros/NotFound.ejs#L16

@bookshelfdave
Contributor Author

As noted in mdn/kuma#4215, this rewrite remains unimplemented:

RewriteCond %{SERVER_NAME} ^developer\.mozilla\.com$
RewriteRule (.*) http://developer.mozilla.org$1 [R=301,L]
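
One possible way to cover this rule outside Apache is a small WSGI wrapper that checks the Host header. This is a sketch, not what mdn/kuma#4215 settled on: `redirect_legacy_host` is a hypothetical name, and it matches on the request's Host header rather than Apache's SERVER_NAME.

```python
def redirect_legacy_host(app, legacy_host="developer.mozilla.com",
                         canonical="http://developer.mozilla.org"):
    """Wrap a WSGI app; 301 requests for the legacy host to the canonical one."""

    def middleware(environ, start_response):
        # Strip any port and compare case-insensitively.
        host = environ.get("HTTP_HOST", "").split(":")[0].lower()
        if host == legacy_host:
            location = canonical + environ.get("PATH_INFO", "/")
            if environ.get("QUERY_STRING"):
                # mod_rewrite keeps the query string by default.
                location += "?" + environ["QUERY_STRING"]
            start_response("301 Moved Permanently",
                           [("Location", location), ("Content-Length", "0")])
            return [b""]
        return app(environ, start_response)

    return middleware
```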

@bookshelfdave
Contributor Author

TODO:

@jwhitlock
Contributor

@bookshelfdave
Contributor Author

bookshelfdave commented May 4, 2017 via email

@bookshelfdave
Contributor Author

Tracked for MDN in Bugzilla

@bookshelfdave
Contributor Author

bookshelfdave commented May 8, 2017

Do we need to also implement rewrites from https://github.com/mozilla/kuma/blob/master/configs/htaccess ?

@jwhitlock
Contributor

:sigh: Looks like it is active in production:

    DocumentRoot "/data/www/developer.mozilla.org/kuma/webroot"

    <Directory /data/www/developer.mozilla.org/kuma/webroot>
        Options +FollowSymLinks
        AllowOverride All
    </Directory>

I'll leave it up to you if you add more to this PR or finish this one and open a second.

@bookshelfdave
Contributor Author

Some of those rules are a bit more complicated, so I suppose I'll follow up in a later PR.

@bookshelfdave bookshelfdave moved this from In Progress to In Review in [Deprecated] MozMEAO SRE May 8, 2017
@bookshelfdave bookshelfdave moved this from In Review to Queued in [Deprecated] MozMEAO SRE May 9, 2017
@bookshelfdave
Contributor Author

Closing this for now; follow-up rewrites will occur here

@bookshelfdave bookshelfdave moved this from Queued to In Review in [Deprecated] MozMEAO SRE May 9, 2017
@bookshelfdave bookshelfdave moved this from In Review to Complete in [Deprecated] MozMEAO SRE May 9, 2017