# Computing digests locally

The `refget` Python package includes general-purpose functions for computing GA4GH-style digests. These functions can be used to compute digests of sequences or sequence collections.


For example, to compute the digest of a single sequence string:

In [3]:
from refget import sha512t24u_digest, fasta_to_digest, fasta_to_seqcol_dict, fasta_to_seq_digests

In [13]:
sha512t24u_digest('GGAA')

'YBbVX0dLKG1ieEDCiMmkrTZFt_Z5Vdaj'
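For readers curious about what happens under the hood, the GA4GH `sha512t24u` scheme is a SHA-512 hash truncated to its first 24 bytes and then base64url-encoded. The following is a minimal standard-library sketch of that scheme, not the package's actual source; the name `sha512t24u_sketch` is hypothetical and used only for illustration.

```python
import base64
import hashlib


def sha512t24u_sketch(data: bytes) -> str:
    """Illustrative re-implementation of the GA4GH sha512t24u scheme.

    Hypothetical helper for explanation only; use refget.sha512t24u_digest
    in real code.
    """
    full_digest = hashlib.sha512(data).digest()  # 64 raw bytes
    truncated = full_digest[:24]                 # keep only the first 24 bytes
    # 24 bytes encode to exactly 32 base64url characters, so no '=' padding appears
    return base64.urlsafe_b64encode(truncated).decode("ascii")


print(sha512t24u_sketch(b"GGAA"))
```

If the package follows the same scheme, the printed value should agree with the `sha512t24u_digest('GGAA')` result shown above. Note that the sketch takes bytes, so a string has to be encoded first.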

You can also compute a top-level (level 0) digest for a FASTA file like this:

In [5]:
fasta_to_digest('../../../test_fasta/base.fa')

'XZlrcEGi6mlopZ2uD8ObHkQB1d0oDwKk'

To get the complete level 2 representation of the sequence collection from the FASTA file, use the `fasta_to_seqcol_dict` function:

In [7]:
fasta_to_seqcol_dict('../../../test_fasta/base.fa')


{'lengths': [8, 4, 4],
 'names': ['chrX', 'chr1', 'chr2'],
 'sequences': ['SQ.iYtREV555dUFKg2_agSJW6suquUyPpMw',
  'SQ.YBbVX0dLKG1ieEDCiMmkrTZFt_Z5Vdaj',
  'SQ.AcLxtBuKEPk_7PGE_H4dGElwZHCujwH6'],
 'sorted_name_length_pairs': [b'{"length":4,"name":"chr1"}',
  b'{"length":4,"name":"chr2"}',
  b'{"length":8,"name":"chrX"}'],
 'sorted_sequences': ['SQ.iYtREV555dUFKg2_agSJW6suquUyPpMw',
  'SQ.YBbVX0dLKG1ieEDCiMmkrTZFt_Z5Vdaj',
  'SQ.AcLxtBuKEPk_7PGE_H4dGElwZHCujwH6']}
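The `sorted_name_length_pairs` entries in this output look like compact JSON objects, one per sequence, sorted lexicographically. The sketch below reproduces that list from the `names` and `lengths` arrays. It assumes compact JSON with sorted keys, which coincides with canonical JSON for simple values like these; the helper name `name_length_pairs_sketch` is hypothetical, and the package may build this attribute differently.

```python
import json

# Values copied from the level 2 output shown above
names = ["chrX", "chr1", "chr2"]
lengths = [8, 4, 4]


def name_length_pairs_sketch(names, lengths):
    """Hypothetical helper: one compact JSON object per sequence, sorted as bytes."""
    pairs = [
        json.dumps(
            {"length": length, "name": name},
            sort_keys=True,          # stable key order: "length" before "name"
            separators=(",", ":"),   # no whitespace between tokens
        ).encode("utf-8")
        for name, length in zip(names, lengths)
    ]
    return sorted(pairs)


for pair in name_length_pairs_sketch(names, lengths):
    print(pair)
```

Because the pairs are sorted, the result does not depend on the order in which sequences appear in the file, which is what makes this attribute useful for order-independent comparisons.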

To compute only the individual digest of each sequence in the file, use the lower-level `fasta_to_seq_digests` function:

In [None]:
for x in fasta_to_seq_digests('../../../test_fasta/base.fa'):
    print(f"{x.metadata.name}\t{x.metadata.length}\t{x.metadata.sha512t24u}\t{x.metadata.md5}")

chrX	8	iYtREV555dUFKg2_agSJW6suquUyPpMw	5f63cfaa3ef61f88c9635fb9d18ec945
chr1	4	YBbVX0dLKG1ieEDCiMmkrTZFt_Z5Vdaj	31fc6ca291a32fb9df82b85e5f077e31
chr2	4	AcLxtBuKEPk_7PGE_H4dGElwZHCujwH6	92c6a56c9e9459d8a42b96f7884710bc
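To make the per-sequence output above more concrete, here is a rough pure-Python sketch that reads a FASTA file and computes a length, a `sha512t24u` digest, and an MD5 digest for each record. It is not how `fasta_to_seq_digests` is implemented. The function names are hypothetical, and uppercasing the sequence before hashing is an assumption based on the refget convention of digesting normalized sequences.

```python
import base64
import hashlib


def iter_fasta_records(path):
    """Yield (name, sequence) tuples from a plain-text FASTA file (hypothetical helper)."""
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                # Use the first whitespace-delimited token of the header as the name
                name = (line[1:].split() or [""])[0]
                chunks = []
            else:
                chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def seq_digests_sketch(path):
    """Yield (name, length, sha512t24u, md5) for each FASTA record (hypothetical helper)."""
    for name, seq in iter_fasta_records(path):
        # Assumption: sequences are uppercased before digesting
        data = seq.upper().encode("ascii")
        sha = base64.urlsafe_b64encode(hashlib.sha512(data).digest()[:24]).decode("ascii")
        md5 = hashlib.md5(data).hexdigest()
        yield name, len(seq), sha, md5
```

Calling `seq_digests_sketch` on a FASTA path and printing each tuple tab-separated gives rows in the same layout as the output above. For large files, the packaged function is the better choice.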
