New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Server: caching improvements #152

Merged
merged 10 commits into from Nov 3, 2014

Conversation

Projects
None yet
2 participants
@mwmiller
Contributor

mwmiller commented Oct 29, 2014

  • Cache perl module status (DDG/App::DuckPAN) for 4 hours, by default
  • Similarly cache all downloaded assets (including the HTML pages)
  • Add option --cachesec/-c to control cache lifetime. -c=0 is very similar to -f
  • Reorganize asset-holding attributes to make it easier to understand how and where to add them
  • Use a standardized way to pull from cache and concatenate (if necessary), including type-specific comments on whence they come
  • Path::Tiny objects pushed "closer to the top" for easier reasoning about the files
  • Error out (and delete cache) when file is not fully received
  • Add download progress bar for more human-friendly output.
  • Minor improvement to, and expanded use of, exit_with_msg
@moollaza

This comment has been minimized.

Show comment
Hide comment
@moollaza

moollaza Oct 29, 2014

Member

@mwmiller nicely done. What exactly was the problem? And how does this fix it? I don't fully understand what's happening differently looking at the diff

Member

moollaza commented Oct 29, 2014

@mwmiller nicely done. What exactly was the problem? And how does this fix it? I don't fully understand what's happening differently looking at the diff

@mwmiller

This comment has been minimized.

Show comment
Hide comment
@mwmiller

mwmiller Oct 30, 2014

Contributor

What exactly was the problem?

The short version is that ddh6 somehow has different stuff on its SERP than duckduckgo.com. This means we can't find a particular asset (notably, in this case, what becomes the page_template_files). This leads to a variety of problems.

This fix, then, just assumes we're OK with empty files in cases where they aren't included on the SERP. I'd like to put a little time into fixing this in a more general way.

Contributor

mwmiller commented Oct 30, 2014

What exactly was the problem?

The short version is that ddh6 somehow has different stuff on its SERP than duckduckgo.com. This means we can't find a particular asset (notably, in this case, what becomes the page_template_files). This leads to a variety of problems.

This fix, then, just assumes we're OK with empty files in cases where they aren't included on the SERP. I'd like to put a little time into fixing this in a more general way.

@moollaza

This comment has been minimized.

Show comment
Hide comment
@moollaza

moollaza Oct 30, 2014

Member

Ah. @mwmiller I see what the problem is. DDH6 is in dev mode. With the recent Grunt changes, the names of the dev versions of the required JS have changes. Base.js and Serp.js are the new names. Those need to be updated in the regex that check for page assets and I think the problem should be resolved if I'm correct.

Member

moollaza commented Oct 30, 2014

Ah. @mwmiller I see what the problem is. DDH6 is in dev mode. With the recent Grunt changes, the names of the dev versions of the required JS have changes. Base.js and Serp.js are the new names. Those need to be updated in the regex that check for page assets and I think the problem should be resolved if I'm correct.

@moollaza

This comment has been minimized.

Show comment
Hide comment
@moollaza
Member

moollaza commented Oct 30, 2014

@mwmiller

This comment has been minimized.

Show comment
Hide comment
@mwmiller

mwmiller Oct 30, 2014

Contributor

@moollaza I figured this was (more or less) what was happening, but I still think that the regex-based matching is pretty fragile.

Contributor

mwmiller commented Oct 30, 2014

@moollaza I figured this was (more or less) what was happening, but I still think that the regex-based matching is pretty fragile.

@moollaza

This comment has been minimized.

Show comment
Hide comment
@moollaza

moollaza Oct 31, 2014

Member

@mwmiller that last commit is great. Looks like you need to rebase though?

Member

moollaza commented Oct 31, 2014

@mwmiller that last commit is great. Looks like you need to rebase though?

@mwmiller

This comment has been minimized.

Show comment
Hide comment
@mwmiller

mwmiller Oct 31, 2014

Contributor

@moollaza I meant to tag this with WIP, too. I got distracted and haven't completed my testing.

Contributor

mwmiller commented Oct 31, 2014

@moollaza I meant to tag this with WIP, too. I got distracted and haven't completed my testing.

@moollaza

This comment has been minimized.

Show comment
Hide comment
@moollaza

moollaza Oct 31, 2014

Member

@mwmiller ah, no prob 👍

Member

moollaza commented Oct 31, 2014

@mwmiller ah, no prob 👍

@mwmiller mwmiller changed the title from Server: deal better with changing page structure. to Server: caching improvements Nov 1, 2014

@mwmiller

This comment has been minimized.

Show comment
Hide comment
@mwmiller

mwmiller Nov 1, 2014

Contributor

I am removing the WIP label from this, because I feel it is done. It could still use copious testing more environments. Assuming it works as well for everyone else as it does for me, it's a huge win for getting a working environment relatively quickly on a poor-quality internet connection.

In a few moments, I'll update the description to more completely indicate what is included.

Contributor

mwmiller commented Nov 1, 2014

I am removing the WIP label from this, because I feel it is done. It could still use copious testing more environments. Assuming it works as well for everyone else as it does for me, it's a huge win for getting a working environment relatively quickly on a poor-quality internet connection.

In a few moments, I'll update the description to more completely indicate what is included.

mwmiller added some commits Oct 29, 2014

Server: deal better with changing page structure.
This is still far too tied to a particular page layout, but it won't die
now.

Fixes #151.  (With improvements to come)
Server: clean up file cache loading somewhat.
This makes a more straight-forward path through the code to understand
what is happening.

Also more commenting on the files themselves, to make it easier to
review in-browser.
Server: further refactor cachefile handling.
This is still not _exactly_ what I want, but we're getting closer.
Server: further streamline caching structure.
This allows for a much better experience when working on a slow link.

- Allow caching of everything (including perl module status and HTML
  pages) for 4 hours, by default.
- Provide an option to change the cache lifetime (-c=0 is more or less
  equivalent to -f)
- Store un-changed files in cache, updating them just before preparing
  to serve.
- Always read from cached file, which helps ensure they will be
  reusable, even when the force is on.

Still a problem: country.json
Server: still better caching information.
- Make it possible to recognize paritally downloaded files, exiting
  and removing them from the cache.
- Add Term::ProgressBar for download tracking.
- Replace die with exit_with_msg.
- Update exit_with_msg for better visual on multi-line messages.
Server: do not trust things external things in the dist-cache
Only copy over files without external soruces (notably duckpan.js, at
present)... everything else is either gotten fresh or cache-checked.

I don't even know how these managed to get distributed to me, but ah,
well.
Server: clean up output a bit.
Also:

- Remove now incorrect comment.
- Simplify the external/dist logic a bit.
Config: clean up path separation.
This straightens out the directory structure, while also expanding the
application of path().
@mwmiller

This comment has been minimized.

Show comment
Hide comment
@mwmiller

mwmiller Nov 1, 2014

Contributor

@moollaza Getting to your actual point, now that I've gone off the deep-end and rewritten much of the caching. 😁

Do I want to add 'base' to the 'duckduck' and 'serp' to the 'duckpan' regexes?

Contributor

mwmiller commented Nov 1, 2014

@moollaza Getting to your actual point, now that I've gone off the deep-end and rewritten much of the caching. 😁

Do I want to add 'base' to the 'duckduck' and 'serp' to the 'duckpan' regexes?

@moollaza moollaza self-assigned this Nov 1, 2014

@moollaza

This comment has been minimized.

Show comment
Hide comment
@moollaza

moollaza Nov 2, 2014

Member

@mwmiller yes the dev files we need are base.js, duckpan.js and serp.js which are the same as d####.js, dpan####.js and g####.js respectively. We can remove duckduck.js, it no longer exists under that name.

Member

moollaza commented Nov 2, 2014

@mwmiller yes the dev files we need are base.js, duckpan.js and serp.js which are the same as d####.js, dpan####.js and g####.js respectively. We can remove duckduck.js, it no longer exists under that name.

mwmiller and others added some commits Nov 2, 2014

Server: store assets on a per-host basis.
If you're changing the hostname, you're probably looking for
different assets.

Also, look for new gruny-style filenames which addresses #151.
@moollaza

This comment has been minimized.

Show comment
Hide comment
@moollaza

moollaza Nov 3, 2014

Member

@mwmiller made a small update to your latest commit -- this should be good now 👍

Member

moollaza commented Nov 3, 2014

@mwmiller made a small update to your latest commit -- this should be good now 👍

@moollaza

This comment has been minimized.

Show comment
Hide comment
@moollaza

moollaza Nov 3, 2014

Member

Tested and looks good. @mwmiller thanks a lot for the tremendous improvements 😃

Member

moollaza commented Nov 3, 2014

Tested and looks good. @mwmiller thanks a lot for the tremendous improvements 😃

moollaza added a commit that referenced this pull request Nov 3, 2014

@moollaza moollaza merged commit 428d5a3 into master Nov 3, 2014

1 check passed

continuous-integration/travis-ci The Travis CI build passed
Details

@moollaza moollaza deleted the mwm/missing_files branch Nov 3, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment