Memcached #58
Conversation
Looks promising from a cursory glance! I've started looking at the patch series now, but I can't promise to be very responsive during Christmas...
I don't remember, of course, but my guess is no. I think that I simply inlined the function since it was only used in one place and because the master branch at the time had started using […].

I think that […]. On the other hand:

Using […]

I think that we need a function that more or less does what […] does.
Have you considered adding the magic to the key (as part of the string or as part of the hash) instead of the value? Then it wouldn't be necessary to verify/prune results with the wrong format. ccache does similar things for the object hash already; see for instance the first […].
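A minimal sketch of what embedding the magic in the key could look like (the `CCACHE_KEY_MAGIC` value and the `make_memcached_key` helper below are hypothetical, not from the patch): an entry written in an incompatible format then simply produces a cache miss instead of a value that has to be validated and pruned.

```c
#include <stdio.h>

/* Hypothetical helper: prefix the memcached key with a format magic
 * and version, so entries written in an incompatible format are
 * never even looked up. */
#define CCACHE_KEY_MAGIC "ccv"   /* assumed value */
#define CCACHE_KEY_VERSION 1

static void make_memcached_key(char *out, size_t out_size,
                               const char *object_hash)
{
    snprintf(out, out_size, "%s%d-%s",
             CCACHE_KEY_MAGIC, CCACHE_KEY_VERSION, object_hash);
}
```

Bumping `CCACHE_KEY_VERSION` on a format change would then invalidate old entries implicitly, with no pruning pass needed.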
Since memcached support is opt-in, the memcached part of the test suite also needs to be opt-in in some way.
Thanks for reviewing. I will take a look at the write functions and rebase + squash the late bug fixes... When it comes to the magic, I think that was only for the memcached storage. It does use the main hash.
Force-pushed from a439672 to 4dc540e
Did the rebase to "master" and the cleanup with regard to squashing bug fixes and commit messages.
Have rebased to master and addressed the cppcheck style issues.
Avoid writing truncated files in the cache in case of any errors. Since we are writing cached files, they are already compressed.
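One common way to get that behavior, sketched here under a POSIX assumption (the function name is made up; the actual patch may do this differently): write to a temporary file and only rename() it into place once the write has fully succeeded, so an error mid-write never leaves a truncated entry visible.

```c
#include <stdio.h>

/* Illustrative sketch: write cache data to a temporary file and
 * rename() it into place only on success, so a failed or interrupted
 * write never leaves a truncated cache entry behind. */
static int put_file_in_cache_atomically(const char *path,
                                        const char *data, size_t size)
{
    char tmp[4096];
    snprintf(tmp, sizeof(tmp), "%s.tmp", path);

    FILE *f = fopen(tmp, "wb");
    if (!f) {
        return -1;
    }
    size_t written = fwrite(data, 1, size, f);
    if (fclose(f) != 0 || written != size) {
        remove(tmp);           /* never leave a partial file behind */
        return -1;
    }
    return rename(tmp, path);  /* atomic within one filesystem on POSIX */
}
```

A real implementation would also want a unique temporary name (PID or random suffix) so concurrent writers don't clobber each other's temp files.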
@jrosdahl, all issues are addressed. Added a "put_data_in_cache".
Looks good. Thanks! I have pushed the commits to a new […] branch. I think that it would be a good idea to ask on the mailing list for other testers/comments before merging it to master.
Sounds good! I wrote some "getting started" instructions (above) on how to build it. Maybe use the same code as in https://ccache.samba.org/performance.html.
Does this really need to be fully implemented as a prerequisite? Or would it be sufficient to create an abstraction that synthesizes access to the constituent parts of the file and leaves the choice of single/multi-file to the abstraction?
@afbjorklund writes:
Yes, I'm painfully aware...
Not sure if I understand this. In what way is the configuration system a hurdle? Should we change something?
I'm not very familiar with the Apache module system, but I think that it would be overkill to be that flexible. I don't see a need for compiling backends statically into the main binary, for instance. Do you?
I'm less interested in having support for different compression algorithms. I think that there is a clear difference between compression and storage: we can choose a single compression algorithm and ccache will work fine for everybody, but I don't think that we can make a single opinionated choice for storage backends.
Thank you.
Right. Even if we only ever will write one storage backend (memcached), I think that the code will benefit from a clean API instead of being sort of hardcoded.
Note that I'm talking about runtime dependencies, not build-time dependencies. It is the installation dependency on libmemcached11 (not the -dev package) that I want to avoid for the main ccache package (and linking statically with libmemcached is not a good option). The envisioned ccache-memcached package would only contain the memcached backend plugin, not the main program.
@cleeland writes:
My tentative idea about the backend API is that it should more or less only contain simple key-value storage primitives. I don't want backends to be aware of which parts a result comprises or how/if they are compressed, etc. The current code stores several entries per result (.o, .stderr, .d, ...) on disk, so using a simple key-value API both for that and for memcached operations would mean that ccache would make several network calls per result, which isn't good. Compare this with @afbjorklund's implementation, which combines all entries into one to avoid multiple memcached operations. This is why I thought that having #218 in place first would make things easier. Does this make sense?
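A hypothetical sketch of such a backend API (all names here are illustrative, not taken from ccache): plain key-value primitives, with the caller combining all parts of a result into one value so a remote backend needs only one network round trip per result. A trivial in-memory backend is included to show the shape.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical backend API: simple key-value storage primitives.
 * The backend never knows which parts a result comprises or how
 * they are compressed; the caller packs them into one value. */
struct storage_backend {
    const char *name;
    bool (*put)(void *ctx, const char *key,
                const void *data, size_t size);
    /* On a hit, returns malloc'ed data via *data; false on miss. */
    bool (*get)(void *ctx, const char *key,
                void **data, size_t *size);
};

/* Trivial single-slot in-memory backend, just for illustration. */
static char slot_key[256];
static void *slot_data;
static size_t slot_size;

static bool mem_put(void *ctx, const char *key,
                    const void *data, size_t size)
{
    (void)ctx;
    free(slot_data);
    slot_data = malloc(size);
    if (!slot_data) {
        return false;
    }
    memcpy(slot_data, data, size);
    slot_size = size;
    snprintf(slot_key, sizeof(slot_key), "%s", key);
    return true;
}

static bool mem_get(void *ctx, const char *key,
                    void **data, size_t *size)
{
    (void)ctx;
    if (!slot_data || strcmp(key, slot_key) != 0) {
        return false;
    }
    *data = malloc(slot_size);
    if (!*data) {
        return false;
    }
    memcpy(*data, slot_data, slot_size);
    *size = slot_size;
    return true;
}
```

With this shape, the disk backend and a memcached backend would both see one opaque blob per result, which is exactly what makes a single network operation per result possible.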
For configuration, I just meant that you have to do the gperf and the rest of it, which is slightly more work than just adding a […]. The main reason for compiling it into the main binary is performance, but I guess it would be a good idea to actually measure the time it would take to […].
We used the Apache runtime for the Ganglia modules, which is why I came to think of it, e.g. https://github.com/ganglia/ganglia-modules-linux/blob/master/io/Makefile.am. Then again, it also uses https://apr.apache.org/docs/apr/1.6/group__apr__dso.html (glib would also work).
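In plain POSIX terms, the run-time loading that apr_dso or glib's GModule wrap can be sketched with dlopen() directly (the plugin file name and symbol name used below are assumptions, not from any existing ccache code):

```c
#include <dlfcn.h>
#include <stdio.h>

/* Load a backend plugin at run time so the main binary has no hard
 * link-time dependency on, say, libmemcached. Returns the address of
 * the named entry point, or NULL on failure. */
static void *load_backend(const char *path, const char *symbol)
{
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (!handle) {
        fprintf(stderr, "cannot load backend %s: %s\n", path, dlerror());
        return NULL;
    }
    void *entry = dlsym(handle, symbol);
    if (!entry) {
        fprintf(stderr, "missing symbol %s: %s\n", symbol, dlerror());
        dlclose(handle);
        return NULL;
    }
    return entry;
}
```

For example, `load_backend("ccache-memcached.so", "ccache_backend_init")` (both names hypothetical) would resolve the plugin's entry point at run time, so only the separate ccache-memcached package, not the main ccache package, would depend on libmemcached11.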
Thanks for discussing the possibilities to evolve this branch. In the meantime, could you do me a favor and merge master into this branch?
I understand that you and others are using the dev/memcached branch in production, which makes me uncomfortable since the branch was meant for developing stuff, not for holding a production-ready and supported semi-permanent branch. I want to have the freedom to make backward-incompatible changes on any dev/* branch, or to just delete them and start over.

At the same time, I don't want to do you a disservice, but I also don't want to spend time maintaining this branch since I have decided that I want to see another direction for the solution, as described earlier. This solution will most likely be incompatible with the dev/memcached branch, and it will be developed iteratively on master and/or other branches, while this branch will be deleted or left as it is until we are done cherry-picking code from it.
ccache version 3.4.2
ccache version 3.4.3
As per the discussion above, I will change the old branch to only track […]. The new branch […].
Hi, I found this work from earlier mailing list archives :) and wanted to ask if this or a newer direction of a shared ccache database index would fit our use case: we have a build farm with many recyclable nodes (their local storage tends to disappear) and a shared NFS server for persistent data, including a […].

Sometimes the NFS access seems like a bottleneck (e.g. local work on the server, such as […]).

I wonder if memcached running on each node (the NFS server as well as builders) could be a faster solution than filesystem queries over NFS to find whether an object with the needed checksum exists, while writes to the […].

Hope the question makes sense :) Also I wonder if this removal of (just part of) the filesystem-bound work would help avoid an issue we have sometimes, when a worker is abruptly killed and it leaves […].
@jimklimov: yes, this was/is the idea of the memcached backend. There is no current support for a mixed memcached/NFS store such as the one you are mentioning; instead, other solutions such as Couchbase were recommended for a persistent cache. But it should be possible to do a home-grown solution, either by syncing the local cache to external backup storage or by using a memcached-download script.

It also runs in a mixed mode, where files are still stored locally on each server but pushed/pulled to the remote for use by other servers, instead of the alternative without local disk, here called "memcached only", where each server talks directly to the memcached cluster... It is also recommended to use a local proxy, to save each and every ccache from having to talk to servers on the network directly.

Reference: […]

There are no good docs on how to set up a cluster, just a sample Vagrantfile on multiple servers...
Closing this PR now as it won't be merged in its current form. It will hopefully reappear in another form in the future. |
As noted above, this […]. Due to the new storage backend, the new memcached structure will probably not be compatible with the old memcached structure used here, but more likely be based on the "aggregated" storage format of #411.
I moved the code that used to be in this branch into a separate ccache fork and a separate binary: https://github.com/afbjorklund/memccache. I think that the C++ version is better served by the new API and the http and redis backends in PR #676. The old code (in C) will follow […].
…emcached port. This patch is not safe for WITH_CCACHE_BUILD support yet, as that causes all ports to depend on devel/ccache. Enabling that patch would then cause the new devel/libmemcached dependency to require devel/ccache, which is a cyclic dependency. The autoconf dependency also causes issues. Add a devel/ccache-memcached slave port that would allow a user to use the ccache+memcached package manually with ports without WITH_CCACHE_BUILD. This patch comes from ccache/ccache#58 and has been an ongoing effort over a few years to be merged into the mainline of ccache. Documentation for it can be found in the MANUAL file at: /usr/local/share/doc/ccache/MANUAL.txt Sponsored by: Dell EMC Isilon
Support for memcached as a distributed cache for ccache
This is the rebased and squashed version of memcached-ccache, memcached-review and memcached-only. It adds two new options, memcached_conf and memcached_only (and a read_only_memcached).
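Assuming the option names above, a ccache.conf enabling the backend might look like the following (the server address is illustrative, and the `--SERVER` syntax is the configuration-string style used by libmemcached; check GETTING_STARTED.md for the exact format this branch expects):

```
# libmemcached-style configuration string for the cache server(s)
memcached_conf = --SERVER=localhost:11211

# Optional: bypass the local file cache and use memcached exclusively
# memcached_only = true

# Optional: read from memcached but never write to it
# read_only_memcached = true
```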
Based on tardyp:memcached-ccache code from 2013 (Pull Request #30)
See http://memcached.org for the cache server (you will also need libmemcached)
Some getting started instructions can be found here: GETTING_STARTED.md
Update: The above instructions will install everything, but on localhost only.
Here is how to continue after that, by using MULTIPLE_SERVERS.md