Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hash Tables #162

Open
wants to merge 11 commits into
base: develop
Choose a base branch
from
Open

Conversation

lorenzo-mechbau
Copy link

@lorenzo-mechbau lorenzo-mechbau commented Aug 1, 2018

This is a first step towards addressing issue #112 for distributed matrix-vectors. In short, each process in a parallel implementation stores a local number of rows of the distributed matrix, but at the same time the global number of columns. The latter should be replaced by a local quantity as well. #112 has already been solved for elements, nodes and dofs. A separated pull request regarding these aspects will follow.

Before addressing the solver routines, an efficient look-up strategy should determine whether a global-numbered dof belongs to the local numbering in the processor (on-rank / off-rank). Two simple options can be devised: Each processor could store all the global dofs and immediately determine the rank-status (on/off) of each dof with high storage consumption. Alternatively, a list of global dofs and corresponding local numbers could be created, limited to the on-rank dofs. Storage costs would be lower, but a linear search through the list would entail relatively high compute costs.

Hash tables are a possible solution to these drawbacks: The paper attached (Fredman 1984) describes the optimal solution to build a hash table for static sets (i.e., for which the number of elements does not change). Given a set of n keys (integer numbers) and corresponding values (i.e., some information connected to them), the keys are distributed in a table of dimension O(n) and can be looked up in time O(1).

In the distributed matrix case, the keys are the the global dofs, the values the local numbering in the rank. Further information connected to the dofs can be added, since the current implementation allows for scalar or array, integer or real values stored into lists (lists.F90).

The type HashTableType is defined in types.F90 and used as a member of DistributedMatrixCMISSType (columnHashTable). The table is initialised in the subroutine DistributedMatrix_CMISSInitialise to contain the information of the mapping columnDomainMapping%LOCAL_TO_GLOBAL_MAP (an array with indices and entries corresponding to the local and global dof numbering). Such a mapping is the array of keys in the hash table. It is taken as input by the subroutine HashTable_ValuesSetAndInsert (implemented in hash_routines.F90), which creates the table according to the procedure described in Fredman 1984. The logical dummy argument in the subroutine indicates the creation of a new table if FALSE, if TRUE the addition of an array of keys and values to a pre-existing table. In this case, since the table is designed for static sets, a new key distribution must be computed.

The subroutine HashTable_GetKey performs the look-up for the desired key in time O(1) according to Fredman 1984. If the key is found, the corresponding value can then be retrieved with HashTable_GetValue. The hash table can be first created in connection with one list of values, but
arbitrarily many additional integer or real values can be added to the same set of keys with the subroutine HashTable_AdditionalValuesSet. After the key look-up, these can be recovered with the HashTable_GetValue subroutine as well. In the case of values of different type and size, the subroutine outputs all the values of the same type and the corresponding list number.

Example:
Key = 28
Values = (1,1,1) (1.56,1.56,1.56) (101.56,101.56) (101,101)

Once the key has been located, calling HashTable_GetValue with an integer array as output dummy argument gives the array of values
(1, 1, 1)
(101,101,0)*
and the list number array (1,4).

This pull request includes a simple test of table construction and key and value query. The test result can be observed by activating the diagnostics in the subroutine DistributedMatrix_CMISSInitialise from any simple example, e.g. Laplace_equation. In the hash table, the keys are the global dofs in each rank, the local numbering builds up the values. These have been manipulated to experiment with multi-dimensional and real values.

Possible issues:

  • Is any other example needed?
  • Fix the output for values of different dimensions (see the "meaningless" 0 at *)?
  • Allow for other types of values (logical, double precision, ...)?

fredman_1984.pdf

@abi-git-user
Copy link
Contributor

Can one of the admins verify this patch?

@hsorby
Copy link
Contributor

hsorby commented Aug 1, 2018

add to whitelist

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants