Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can key lifetime be shortened to insert/get/destroy functions only? #50

Open
peter-scholtens opened this issue Apr 25, 2023 · 6 comments

Comments

@peter-scholtens
Copy link
Contributor

peter-scholtens commented Apr 25, 2023

As the (ri)stretto algorithm does not store/modify the key, it should also be able to only borrow it? The lifetime of this borrow should therefore only resemble or exceed the lifetime of the function calls of insert(), try_insert(), get(), get_mut() and remove(), but not the lifetime of the full cache, I assume. To simply demonstrate the idea I modified the synced example version to this code below where the key is a &str:

use std::time::Duration;
use stretto::Cache;

// key should live as long as function call while cache takes ownership of value
fn new_cache<'a>() -> Cache<&'a str, String> {
    let c: Cache<&str, String> = Cache::new(12960, 1e6 as i64).unwrap();
    c
}

fn main() {
    let r = rand::random::<u8>();
    let c = new_cache();

    {
        // create storage key from random number, and set a value with a cost of 1
        let store_key = format!("key1-{}", r);
        // set a value with a cost of 1
        c.insert(&store_key, "value1".to_string(), 1);
        // set a value with a cost of 1 and ttl
        c.insert_with_ttl("key2", "value2".to_string(), 1, Duration::from_secs(3));

        // wait for value to pass through buffers
        c.wait().unwrap();
    }

    // Create a search key
    let key1 = "key1";

//... rest is not modified ...

However, that fails with the following error:

24 |     }
   |     - `store_key` dropped here while still borrowed

The idea of this enhancement is that users of the cache do not need to copy their key str to a String possibly saving CPU processing time by only handling a non-mutable borrow. I assume this can be done by inserting a lifetime specifier in the function calls?

@al8n
Copy link
Owner

al8n commented Apr 25, 2023

Yes, we can only borrow it. We only need a key to calculate a hash value and then we forget the key.

@peter-scholtens
Copy link
Contributor Author

I don't understand you reply. With version b161a98 and the above example, I observe the problem as mentioned above: store_key is still borrowed by cache c which causes the compiler error. So, the algorithm can do that, but the current Rust code cannot. Shall I investigate a patch for this?

@al8n
Copy link
Owner

al8n commented Apr 25, 2023

I don't understand you reply. With version b161a98 and the above example, I observe the problem as mentioned above: store_key is still borrowed by cache c which causes the compiler error. So, the algorithm can do that, but the current Rust code cannot. Shall I investigate a patch for this?

Yeah, the current Rust code cannot do that. It needs some small changes. Any changes to solve this are welcome.

@peter-scholtens
Copy link
Contributor Author

peter-scholtens commented May 1, 2023

I tried several thing with lifetimes, reference to types etc., but the only working solution I could invent avoids mentioning the key type at the instantiation time of the cache, but sets the type at the calling get() or get_mut() statement. Otherwise it seems you either get a borrowed value does not live long enough or a doesn't have a size known at compile-time error when including the key type in the structure definition. The weird thing is that we now get another degree of freedom to use different key types for the same cache like e.g. combining c.insert::<str>( ... ) and c.insert::<u64>( .. ) in the same code is possible. However, as this results every time in a close to 100% cache miss, it is unlikely this misuse will propagate to real production code.

When studying the code, some some questions rose in my mind:

  1. No single time the function borrow() is called. It looks to me we can easily remove all this Borrow trait conditions of insertion keys K and query key Q?
  2. What's the reason for fn build_key( ... ) -> (u64, u64) and not using fn build_key( ... ) -> (u128)?

Some info I used from stackoverflow and also someone else who noticed that copying should be avoided for a fast response webproxy: "...it has to read it from the NGINX C struct, allocate a Lua string and then copy it...".

Below a working example, where the key is borrowed several times.

#![feature(default_free_fn)]

use std::collections::hash_map::DefaultHasher;
use std::fmt::Debug;
use std::hash::{Hash, Hasher};

// BorrowedKeyCache holds only one value, with a hashed key.
#[derive(Debug)]
struct BorrowedKeyCache<V> {
    stored_hkey: Option<u64>,
    stored_value: V,
}

impl<V: Debug + Default> BorrowedKeyCache<V> {
    /// Instanciate new, empty cache
    pub fn new() -> BorrowedKeyCache<V> {
        return BorrowedKeyCache {
            stored_hkey: None,
            stored_value: <V>::default(),
        };
    }

    /// Insert value V while borrowing key K
    pub fn insert<K>(&mut self, key: &K, value: V) -> bool
    where
        K: Hash + Eq + Debug + ?Sized,
    {
        println!("insert: {:?}, v: {:?}", key, value);
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        self.stored_hkey = Some(hasher.finish());
        self.stored_value = value;
        return true;
    }

    /// Query with key Q
    pub fn get<Q>(&self, key: &Q) -> Option<&V>
    where
        Q: Hash + Eq + Debug + ?Sized,
    {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        if self.stored_hkey == Some(hasher.finish()) {
            println!("found:  {:?}", key);
            return Some(&self.stored_value);
        } else {
            println!("NOT found: {:?}", key);
            return None;
        }
    }

    /// Query and modify with key Q
    pub fn get_mut<Q>(&mut self, key: &Q) -> Option<&mut V>
    where
        Q: Hash + Eq + Debug + ?Sized,
    {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        if self.stored_hkey == Some(hasher.finish()) {
            println!("found:  {:?}", key);
            return Some(&mut self.stored_value);
        } else {
            println!("NOT found: {:?}", key);
            return None;
        }
    }

    /// Remove a specific item, true if succesfull
    pub fn remove<Q>(&mut self, key: &Q) -> bool
    where
        Q: Hash,
    {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        let mut destroy = false;
        if self.stored_hkey == Some(hasher.finish()) {
            destroy = true;
        }
        if destroy == true {
            self.stored_hkey = None;
            self.stored_value = <V>::default();
        }
        return destroy;
    }

    /// Fully clear cache
    pub fn clear(&mut self) {
        self.stored_hkey = None;
        self.stored_value = <V>::default();
    }

    /// Reports number of items
    pub fn len(&self) -> usize {
        if self.stored_hkey.is_some() {
            return 1;
        } else {
            return 0;
        };
    }
}

fn main() {
    // Create our web cookie cache:
    let mut c: BorrowedKeyCache<String> = BorrowedKeyCache::new();
    {
        let initial_cookie = String::from("welcome123");
        c.insert::<str>(&initial_cookie.as_str(), "Initial user session".to_string());
        println!("Set cookie in HTTP return header: {}", initial_cookie);
        // Variable 'initial_cookie' can be dropped here.
    }

    // Check if a returning visitor with same key is recognized:
    {
        let returning_visitor_cookie = String::from("welcome123");
        println!(
            "LOG: key {:?} received, content in cache {:?}",
            returning_visitor_cookie,
            c.get(&returning_visitor_cookie)
        );
        // And modify, borrowing the key again,
        *c.get_mut(&returning_visitor_cookie).unwrap() = "Second session".to_string();
    }

    {
        // A user logs out
        let logout_user_cookie = String::from("goodbye890");
        println!("LOG: key {:?} received, logging out", logout_user_cookie);
        c.remove(&logout_user_cookie);
    }

    // All keys dropped, cache still filled.
    println!("Number of items in cache {}", c.len());
    c.clear();
    println!("Reduced to {} items.", c.len());
}

@al8n
Copy link
Owner

al8n commented May 4, 2023

  1. Yes, we can remove all borrow stuff.
  2. If we use u128, then we need some extra calculations, e.g. high 8 bits to store the result of first hash algorithm, low 8 bits to store the result from the second hash algorithm. Most of users will not need the second hash algorithm. FMO, I do not think the extra calculations will influence performance, but I have to say I am a bit lazy and just leave it (64,64).

@peter-scholtens
Copy link
Contributor Author

Okay so ..

  1. Done, see Removed unused Borrow trait requirements #53
  2. For now I agree that this is handy. In the very far future where 128-bit processor become more common and have a built-in hash function you can still modify it and increment the major number of the revision of the package.

This leaves the original question still open: will the code be modified to have a borrowed key at the insertion function, as demonstrated in #50 (comment) ? Possibly there exists a better implementation, but I didn't find it. As stretto is advertising to have no keys stored, its users should not be obliged to copy a key and let stretto destroy it (as in the current situation).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants