Skip to content

This is my REALLY FAST implementation of a hash table in C, in 150 lines of code.

Notifications You must be signed in to change notification settings

exebook/hashdict.c

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

13 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

hashdict.c

This is my REALLY FAST implementation of a hash table in C, in under 200 lines of code.

This is in fact a port of my hashdic previously written in C++ for jslike project (which is a var class making programming in C++ as easy as in JavaScript).

MIT licensed.

For some reason it is more than twice as fast on my benchmarks as the hash table used in Redis. But unlike Redis version of a hash table there is no incremental resize.

The hash function used is my adaptation of Meiyan/7zCRC, it is better than MurMur3.

Hash slot duplicates are stored as linked list node = node->next.

example

#include <sys/time.h>
#include <stdio.h>
#include "hashdict.h"

int main() {
	struct dictionary* dic = dic_new(0);
	
	dic_add(dic, "ABC", 3);
	*dic->value = 100;
	
	dic_add(dic, "DE", 2);
	*dic->value = 200;
	
	dic_add(dic, "HJKL", 4);
	*dic->value = 300;
	
	if (dic_find(dic, "ABC", 3))
		printf("ABC found: %i\n", *dic->value);
	else printf("error\n");
	
	if (dic_find(dic, "DE", 2))
		printf("DE found: %i\n", *dic->value);
	else printf("error\n");
	
	if (dic_find(dic, "HJKL", 4))
		printf("HJKL found: %i\n", *dic->value);
	else printf("error\n");
	
	dic_delete(dic);
	return 0;
}

dic_add()

Add a new key to the hash table.

Unlike most of implementations, you do NOT supply the value as the argument for the add() function. Instead after dic_add() returns, set the value like this: *mydic->value = <VALUE>.

int dic_add(struct dictionary* dic, void *key, int keyn);

Returns true(1) if the key was already in the table, otherwise false(0). In both cases you can change the associated value by referencing the *dic->value field after dic_add() returns.

dic_find()

Lookup the key in the hash table. Return true(1) if found, the you can get the value like this: myvalue = *dic->value.

int dic_find(struct dictionary* dic, void *key, int keyn);

dic_new()

Create the hash table.

struct dictionary* dic_new(int initial_size);

Set initial_size to 0 for the initial size of the table, which is 1024 items. Useful when you know how much keys you will store and want to preallocate, in which case use N/growth_treshold as the initial_size. growth_threshold is 2.0 by default.

dic_delete()

Delete the hash table and frees all occupied memory.

void dic_delete(struct dictionary* dic);

forEach()

Iterates over all keys in the table and calls the specified callback for each of them.

void dic_forEach(struct dictionary* dic, enumFunc f, void *user);

typedef int (*enumFunc)(void *key, int count, int *value, void *user);

tuning

growth (resize)

struct dictionary has tuning fields:

growth_threshold: when to resize, for example 0.5 means "if number of inserted keys is half of the table length then resize". Default: 2.0;

My experiments on English dictionary shows balanced performance/memory savings with 1.0 to 2.0.

growth_factor: grow the size of hash table by N. Suggested number is between 2 (conserve memory) and 10 (faster insertions).

The key is a combination of a pointer to bytes and a count of bytes.

value type

The value type is defined as int in the header file, you can redefine it to the required type, which could be char or void* or anything you need.

#define HASHDICT_VALUE_TYPE int

About

This is my REALLY FAST implementation of a hash table in C, in 150 lines of code.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages