Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Namespace Cache Typo & remove useless query #2582

Merged
merged 1 commit into from
Apr 5, 2016

Conversation

jonmifsud
Copy link
Member

First of all there's a big 'error' on the namespace which never matched as there was a typo.

Second if a hash exists; the namespace query is run for no reason as data is overwritten by the if statement underneath.

First of all there's a big 'error' on the namespace which never matched as there was a typo.

Second if a hash exists; the namespace query is run for no reason as data is overwritten by the if statement underneath.
@brendo
Copy link
Member

brendo commented Apr 5, 2016

Good find, very baad typo by me!

@brendo brendo merged commit 06e9f4d into symphonycms:2.7.x Apr 5, 2016
@brendo
Copy link
Member

brendo commented Apr 5, 2016

If there's another 2.6.x release planned nitriques this should be cherry picked

@nitriques
Copy link
Member

Thanks guys. Been really busy the past week. And I've turn 30 :P
I hope to get 2.7.0 out soon. If this is urgent, I could do another 2.6.x.

@twiro
Copy link
Contributor

twiro commented Apr 12, 2016

And I've turn 30 :P

Congrats! And all the best for your thirties :)

I hope to get 2.7.0 out soon.

May I ask how soon? I started working on a small interface feature I was missing in Symphony for years now and I would love to include it in the upcoming 2.7-release (as this might be the last minor/feature-update of the 2.x-branch).

But I'm only halfway there yet and won't find any time to work on it in the next 2 weeks... maybe I could post some details and a first rough branch in 3 weeks to get some feedback from you guys.

I might also need some help optimizing one or two database-calls from someone who is more experienced with MYSQL than I am... anyone volunteering to give me some assistance with that? Would be very much appreciated!

@munki-boy
Copy link

@nitriques congrats mate, hope it's a great year for you.

@michael-e
Copy link
Member

Lots of young people here... Congrats, @nitriques!

@nitriques
Copy link
Member

@twiro I've been dealing with a reduced tasked force and more work then ever at work for the past couple of weeks, so I can't give any timeframe. Hopefully, we've managed to find some good people to come to work with us so I'll be able to see things more clearly in the coming weeks. That being said, I do not like the idea of waiting for things to happen as this has been proven to not work out. Do continue to work on it as you can, without any rush to ship things. Versions number do not matter. As long as it's compatible we can ship it quick. I am glad to release things when people are happy with the changes! Let's do what we can and manage releases around that instead of the other way around.

@munki-boy @michael-e Thanks guys!

@nitriques
Copy link
Member

@brendo @michael-e Is there a reason why the cache does not work when passing both an hash AND namespace ?

I know we already addressed this, but I got bitten again by it while building the sri extension. Why can't we check both hash and namespace in the where clause again ?

@michael-e
Copy link
Member

Sorry, I have no idea regarding this caching stuff. Hope @brendo can shed light on it.

@nitriques
Copy link
Member

Thanks Michael. Do you agree that it's a weird behaviour ?

@michael-e
Copy link
Member

I don't even fully understand how it is intended to work. Is there any documentation on this matter?

@nitriques
Copy link
Member

I do not think so, sadly

@michael-e
Copy link
Member

So in order to say anything useful, I would have to dive into it... All I could do right now is ask stupid questions like:

  • What is your expectation regarding the namespace/hash matter?
  • What is the hash used for anyway? (Maybe: find the entry?)
  • If the hash is used to find the entry, why is the namespace not included in the hash?
  • Can this mechanism be safe regarding hash collisions? (The probablity of collisions depends on the length of the hashed string, of course.)

@nitriques
Copy link
Member

I'll try to provide answers!

What is your expectation regarding the namespace/hash matter?

  1. $hash + $namespace both not null -> do a hash AND namespace search
  2. $hash not null + $namespace null -> do a hash search
  3. $hash null + $namespace not null -> do a namespace search

Case 1 is not implemented right now and always returns null.

What is the hash used for anyway? (Maybe: find the entry?)

It's currently used as the primary key yes. The system is setup in order to support multiple cache providers, so that's why the behaviour is hard to grasp.

If the hash is used to find the entry, why is the namespace not included in the hash?

In my understanding, namespaces were added after the initial implementation. It was added to be able to clean a "group" of records, hence why it's isolated. It's a common need with reddis if I recall correctly.

Can this mechanism be safe regarding hash collisions? (The probablity of collisions depends on the length of the hashed string, of course.)

The key is currently limited to 32 chars, which is also a flaw (I was putting paths in there but changed to md5($path) due to this limitation). So not, it's not currently safe theoretically, but I've never encounter a md5 collision in my life.

@jonmifsud
Copy link
Member Author

Why can't we check both hash and namespace in the where clause again ?

Basically my understanding is that a hash is unique; regardless of the
namespace. So you do not need a namespace to generate a hash. Hashes are
generated via md5 or something of the sort which is usually up to the
extension/provider to decide on the source but that has to be generated
before any processing; so in case of data sources we use the param name +
param filters.

The namespace is there to simplify 'deleting' and 'obtaining' grouped cache
entries. For Example a namespace could be a name of a section/datasource
which is used to clear all the caches related to that section/datasource on
save through the backend. The namespace's purpose is not to ensure
uniqueness but rather to provide functionality to read/delete multiple
items in one go vs trying to generate all the hashes at random.

Thus whenever a query with both namespace and hash is sent; only the hash
is going to be used as that is considered unique. As far as I'm aware we
don't handle collisions they're pretty rare... though you could say that
the extension which would use the cache provider should handle this, as the
extension would usually send in the hash and data thus the generic class
has to be agnostic you can't know what data is expected on the other end.

On Sat, 23 Apr 2016 at 19:18 michael-e notifications@github.com wrote:

So in order to say anything useful, I would have to dive into it... All I
could do right now is ask stupid questions like:

  • What is your expectation regarding the namespace/hash matter?
  • What is the hash used for anyway? (Maybe: find the entry?)
  • If the hash is used to find the entry, why is the namespace not
    included in the hash?
  • Can this mechanism be safe regarding hash collisions? (The
    probablity of collisions depends on the length of the hashed string, of
    course.)


You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub
#2582 (comment)

@nitriques
Copy link
Member

That would explain a lot. But, should not we consider checking only the namespace if hash is null and revert to hash if it's not null. At least, my code would look ok with the namespace being specified in every call (read, write, delete) ? I find it rather strange to get null while passing both values.

@jonmifsud
Copy link
Member Author

@nitriques my pull request should only check the namespace when the hash is null, as it was doing a fetch which was overwritten by another query (performance mainly) and btw my code still works with both namespace and hash specified in each call, at least there is no reason why it should not... what I just realised is that we delete the namespace if there is no data; which could possibly cause performance issues, if the hash is defined, as we only want the hash deleted and not the namespace.

@nitriques
Copy link
Member

Indeed.

@brendo
Copy link
Member

brendo commented Apr 24, 2016

$hash + $namespace both not null -> do a hash AND namespace search
$hash not null + $namespace null -> do a hash search
$hash null + $namespace not null -> do a namespace search

This sounds like the most logical behaviour to me

@nitriques
Copy link
Member

To me too! But I fear like it can't be supported on ALL cache providers (does reddis, memcache and al support $hash + $namespace search ?)

@jonmifsud
Copy link
Member Author

Memcache doesn't not by default, but a php driver can handle namespacing at
least that's a best practice from what I read. It's how I set up the driver
to work
On Mon, 25 Apr 2016 at 21:44, Nicolas Brassard notifications@github.com
wrote:

To me too! But I fear like it can't be supported on ALL cache providers
(does reddis, memcache and al support $hash + $namespace search ?)


You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub
#2582 (comment)

@nitriques
Copy link
Member

@jonmifsud so it's safe to add the

$hash + $namespace both not null -> do a hash AND namespace search

bit ?

@michael-e michael-e mentioned this pull request Jul 19, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants