Transaction Number Generator #173

Open
martindsouza opened this Issue Nov 2, 2017 · 4 comments

Member

martindsouza commented Nov 2, 2017

Help Wanted!

Please read through this. I need help determining what to call these function(s):

  • What to name this function
  • What characters to exclude such as oOlL etc.

Problem

Sometimes business users want a transaction number. Ex: invoice number.

What most people do is create a sequence and then pad it with some 0s. Ex: 0081. The problem is what happens once we pass 9999: we'll then have a mix of 4-character and 5-character transaction numbers.

A simple solution is to convert a sequence to hex, thus giving transaction numbers like A08F. Though this helps, in high-transaction systems the 6 additional symbols per character (A-F) aren't enough, or they require transaction numbers to be purposely long. See the table below for stats.

The following table shows how many values you can get for the number of characters:

Characters | Base 10     | Base 16 (Hex) | Base 36 (0-Z)
-----------|-------------|---------------|-----------------
1          | 10          | 16            | 36
2          | 100         | 256           | 1296
3          | 1000        | 4096          | 46656
4          | 10000       | 65536         | 1679616
5          | 100000      | 1048576       | 60466176
6          | 1000000     | 16777216      | 2176782336
7          | 10000000    | 268435456     | 78364164096
8          | 100000000   | 4294967296    | 2821109907456
9          | 1000000000  | 68719476736   | 101559956668416
10         | 10000000000 | 1099511627776 | 3656158440062976
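
Each figure in the table is simply the base raised to the number of characters. A minimal sketch that reproduces the numbers in Oracle SQL (the column aliases are illustrative only, not part of the proposal):

```sql
-- Capacity of an n-character identifier = base ^ n
select
  level            as characters,
  power(10, level) as base_10,
  power(16, level) as base_16,
  power(36, level) as base_36
from dual
connect by level <= 10
order by level;
```

This makes it easy to check other bases, such as a reduced alphabet, by swapping in a different first argument to power.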

Solution

The proposed solution is to allow for transaction numbers that cover 0-Z, i.e. 0,1,...,9,A,B,...,Z for each character (base 36). We could expand this in the future to go beyond base 36, but we would need to define what the 37th character would look like.

The following query converts a number to base 36:

with 
  lvls as (
    select level lvl
    from dual
    -- The <= logic will return the number of characters required for conversion
    connect by level <= ceil(log(:base, :x)) + decode(log(:base, :x), ceil(log(:base, :x)), 1,0)
  ),
  -- Alphabet 0..Z
  alphabet as (
    select 
      level-1 num,
      case
        when level-1 < 10 then to_char(level-1)
        else chr( ascii('A')+level-1-10)
      end letter
    from dual
    connect by level <= :base
  ),
  -- Returns rows for all the decimal values for each character position
  my_data as (
    select
      to_char(:x, 'XXXXXX') hex_val, -- for testing
      lvl,
      remainder,
      quotient
    from lvls
    model
      return all rows
      dimension by (lvl)
      measures( 0 remainder, 0 quotient)
      rules 
      ( 
        -- Order matters here. I.e. quotient must be computed before remainder so remainder can "see" the previous level's quotient
        quotient[lvl] = trunc(nvl(quotient[cv(lvl)-1], :x) / :base),
        remainder[lvl] = mod(nvl(quotient[cv(lvl)-1], :x), :base)
        
      )
  )
select
  to_char(:x, 'XXXXXX') hex_conv, -- to test for hex
  listagg(a.letter, '') within group (order by md.lvl desc) basex
from my_data md, alphabet a
where 1=1
  and md.remainder = a.num
-- For testing
--select *
--from my_data
;

PL/SQL version:

create or replace function basex (
  p_num in integer,
  p_base in integer)
  return varchar2
as
  l_return varchar2(255);
  l_quotient integer;
  l_remainder integer;
begin

  -- TODO mdsouza: check that p_num >= 0 and p_base is between 10 and 36

  l_quotient := p_num;
  
  while l_quotient > 0 loop
    l_remainder := mod(l_quotient, p_base);
    l_quotient := trunc(l_quotient / p_base);

    if l_remainder < 10 then
      l_return := to_char(l_remainder) || l_return;
    else
      -- Subtract 10 since remainders 0-9 are covered above
      l_return := chr(ascii('A') + l_remainder - 10) || l_return;
    end if;
  end loop;

  return l_return;
end basex;
/
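
The TODO above could be filled in roughly as follows. This is only a sketch under a few assumptions: the name basex_checked and the error codes -20001/-20002 are made up for illustration, and returning '0' for an input of 0 is my reading of the intent (the function above returns null in that case because the loop never runs).

```sql
create or replace function basex_checked (
  p_num in integer,
  p_base in integer)
  return varchar2
as
  l_return varchar2(255);
  l_quotient integer;
  l_remainder integer;
begin
  -- Assumed rules, taken from the TODO: p_num >= 0 and p_base between 10 and 36
  if p_num is null or p_num < 0 then
    raise_application_error(-20001, 'p_num must be a whole number >= 0');
  end if;

  if p_base is null or p_base not between 10 and 36 then
    raise_application_error(-20002, 'p_base must be between 10 and 36');
  end if;

  -- Assumption: zero maps to '0' instead of null
  if p_num = 0 then
    return '0';
  end if;

  l_quotient := p_num;

  -- Same repeated divide/remainder loop as basex
  while l_quotient > 0 loop
    l_remainder := mod(l_quotient, p_base);
    l_quotient := trunc(l_quotient / p_base);

    if l_remainder < 10 then
      l_return := to_char(l_remainder) || l_return;
    else
      -- Remainders 10..35 map to A..Z
      l_return := chr(ascii('A') + l_remainder - 10) || l_return;
    end if;
  end loop;

  return l_return;
end basex_checked;
/
```

For example, basex_checked(41103, 16) should give A08F (the hex value used in the problem description), and basex_checked(1679615, 36) should give ZZZZ, the largest 4-character base 36 value.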

Tasks

  • Maybe create a separate dec2base function (better name required)
  • Create function to get transaction number given a decimal.
  • Optional parameters could be: p_length and p_pad (either both required or both not required). Note: thinking about this, we may make them required, since dec2base would handle the unpadded version
  • Create a future ticket to go beyond base 36 (low priority) to determine the extra characters
  • Check that p_base is between 10 and 36
  • Check that p_x is a whole number > 0
  • See @connormcd's example below, combined with @dmcghan's suggestion on Twitter to remove characters like oOlL that may be hard to decipher.
  • Allow the user to pass in their own alphabet/dictionary for character mapping and use substrings as per @connormcd. If we do this, we need to check that p_base = length(p_char_mapping)
  • Alphabet needs to ensure no dups
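
To make a few of these tasks concrete, here is one possible shape for the transaction number function. It is a sketch, not a proposal for the final API: the name txn_num, the parameter p_alphabet, and the error codes are placeholders, and raising an error when the value does not fit in p_length is an assumption on my part. The base is derived from the alphabet length, so the p_base = length(p_char_mapping) check is satisfied by construction.

```sql
create or replace function txn_num (
  p_num      in integer,
  p_alphabet in varchar2 default '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ',
  p_length   in integer  default null,
  p_pad      in varchar2 default null)
  return varchar2
as
  l_base     pls_integer := length(p_alphabet);
  l_quotient integer     := p_num;
  l_digit    pls_integer;
  l_return   varchar2(255);
begin
  if p_num is null or p_num < 0 then
    raise_application_error(-20001, 'p_num must be a whole number >= 0');
  end if;

  if l_base is null or l_base < 2 then
    raise_application_error(-20002, 'p_alphabet needs at least 2 characters');
  end if;

  -- No dups: the first occurrence of each character must be its own position
  for i in 1 .. l_base loop
    if instr(p_alphabet, substr(p_alphabet, i, 1)) != i then
      raise_application_error(-20003,
        'p_alphabet contains a duplicate: ' || substr(p_alphabet, i, 1));
    end if;
  end loop;

  -- p_length and p_pad are either both supplied or both omitted
  if (p_length is null and p_pad is not null)
    or (p_length is not null and p_pad is null)
  then
    raise_application_error(-20004, 'p_length and p_pad must be used together');
  end if;

  -- Runs at least once so that 0 maps to the first alphabet character
  loop
    l_digit    := mod(l_quotient, l_base);
    l_quotient := trunc(l_quotient / l_base);
    l_return   := substr(p_alphabet, l_digit + 1, 1) || l_return;
    exit when l_quotient = 0;
  end loop;

  if p_length is not null then
    -- Assumption: overflowing the fixed length is an error, not a truncation
    if length(l_return) > p_length then
      raise_application_error(-20005, 'value does not fit in p_length characters');
    end if;
    l_return := lpad(l_return, p_length, p_pad);
  end if;

  return l_return;
end txn_num;
/
```

With the default alphabet, txn_num(p_num => 81, p_length => 4, p_pad => '0') should return 0029, since 81 is 2*36 + 9.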

@martindsouza martindsouza added this to the 1.1.0 milestone Nov 2, 2017

jeffreykemp commented Nov 2, 2017

I would call it to_base36.

connormcd commented Nov 2, 2017

I found that enumerating the symbols in advance, and exchanging the MOD for subtraction gives a little perf boost. Around 15% on my machine.

create or replace function basex2 (
  p_num in integer,
  p_base in integer)
  return varchar2
as
  l_symbols      varchar2(64) := '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';
  l_return       varchar2(255);
  l_quotient     integer := p_num;
  l_trunced_part integer;
  l_remainder    integer;
begin

  while l_quotient > 0 loop
    l_trunced_part := trunc(l_quotient / p_base);
    l_remainder    := l_quotient - l_trunced_part*p_base;
    l_quotient     := l_trunced_part;

    l_return := substr(l_symbols,l_remainder+1,1) || l_return;
  end loop;

  return l_return;
end;

Contributor

zhudock commented Nov 7, 2017

#67 and #128 both reference base conversion as well. I realize the goal of this issue is to add some additional functionality on top of the base conversion, but we should avoid duplicating any code.

Contributor

zhudock commented Nov 7, 2017

Regarding @dmcghan's suggestion on Twitter, this file from PWGen has the list of ambiguous characters it uses.

http://pwgen.cvs.sourceforge.net/viewvc/pwgen/src/pw_rand.c?view=markup

const char *pw_ambiguous = "B8G6I1l0OQDS5Z2";

I'd suggest adding lowercase o, i, s and z as well, but excluding everything from that list may be a bit too aggressive.
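
To get a feel for how aggressive that would be, here is a rough sketch that strips the PWGen list from the 0-Z alphabet using Oracle's translate. The '#' is just a placeholder character that maps to itself, so every other character in the second argument is removed; the column aliases are illustrative only.

```sql
-- Remove every PWGen "ambiguous" character from the base 36 alphabet
select
  translate(
    '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ',
    '#B8G6I1l0OQDS5Z2',
    '#') as safe_alphabet,
  length(
    translate(
      '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ',
      '#B8G6I1l0OQDS5Z2',
      '#')) as safe_base
from dual;
```

That leaves 3479ACEFHJKLMNPRTUVWXY, i.e. only 22 symbols, so a 4-character transaction number would drop from 1679616 possible values to 22^4 = 234256.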
