# `clean_nums()`: Clean Different Types of Numbers

A wrapper around [python-stdnum](https://arthurdejong.org/python-stdnum/), a library for validating and standardizing numbers that supports more than 180 number types.

`clean_nums()` validates and standardizes the format of these numbers.

# Features

1. Automatically recognize the type of each input number.

2. Reformat the input numbers.

3. If more than one type is inferred for an input number, pick the best match according to user-specified keywords.

4. If no type matches, treat the input number as invalid.

5. Convert invalid numbers to `NaN`.

6. Standardize null values.

7. Let users specify the output attributes they want.
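The features above can be sketched in plain Python. This is only an illustration under stated assumptions, not the planned implementation: `is_issn` and `is_ar_dni` are hand-written stand-ins for the validators that python-stdnum provides, and `NULL_VALUES`, `VALIDATORS`, `infer_types`, and `clean_value` are hypothetical names. Taking the first candidate when no keyword is given is also an assumption, since the design does not say how such ties are broken.

```python
import math
from typing import Callable, Dict, List, Optional, Tuple, Union

DIGITS = "0123456789"
# Hypothetical null markers; the actual set is a design decision.
NULL_VALUES = {"", "na", "n/a", "nan", "null", "none"}


def compact(value: str) -> str:
    """Strip surrounding whitespace and remove hyphens and spaces."""
    return value.strip().replace("-", "").replace(" ", "").upper()


def is_issn(number: str) -> bool:
    """ISSN check: weights 8..1, weighted sum divisible by 11 ('X' = 10)."""
    if len(number) != 8 or any(c not in DIGITS for c in number[:7]):
        return False
    if number[7] not in DIGITS + "X":
        return False
    values = [int(c) for c in number[:7]]
    values.append(10 if number[7] == "X" else int(number[7]))
    return sum(w * v for w, v in zip(range(8, 0, -1), values)) % 11 == 0


def is_ar_dni(number: str) -> bool:
    """Simplified Argentinian DNI check: 7 or 8 digits, no checksum."""
    return len(number) in (7, 8) and all(c in DIGITS for c in number)


# Type name -> (description searched for keywords, validator).
VALIDATORS: Dict[str, Tuple[str, Callable[[str], bool]]] = {
    "ISSN": ("International Standard Serial Number", is_issn),
    "DNI": ("Documento Nacional de Identidad, Argentinian national identity nr.",
            is_ar_dni),
}


def infer_types(value: str) -> List[str]:
    """Feature 1: return every registered type whose validator accepts the value."""
    number = compact(value)
    return [name for name, (_, check) in VALIDATORS.items() if check(number)]


def clean_value(
    value: object, format_keyword: Optional[str] = None
) -> Tuple[Union[str, float], Optional[str]]:
    """Return (cleaned number, type), or (NaN, None) for null or invalid input."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return math.nan, None
    text = str(value)
    if text.strip().lower() in NULL_VALUES:
        return math.nan, None  # feature 6: standardize null values
    candidates = infer_types(text)
    if format_keyword and len(candidates) > 1:
        # Feature 3: prefer types whose description mentions the keyword.
        key = format_keyword.lower()
        matched = [n for n in candidates if key in VALIDATORS[n][0].lower()]
        candidates = matched or candidates
    if not candidates:
        return math.nan, None  # features 4 and 5: no match means invalid
    # Assumption: with no deciding keyword, take the first candidate.
    return compact(text), candidates[0]


print(infer_types("0317-8471"))
print(clean_value("0317-8471", format_keyword="Argentinian"))
print(clean_value("not a number"))
```

Here `0317-8471` passes both checks, so `format_keyword` decides between the two types, while a value that passes no check comes back as `NaN`.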

# Tentative design

from typing import Optional, Union

import dask.dataframe as dd
import pandas as pd


def clean_nums(
    df: Union[pd.DataFrame, dd.DataFrame],
    column: str,
    input_format: str = "auto",
    output_format: str = "standard",
    format_keyword: Optional[str] = None, 
    inplace: bool = False,
    report: bool = True,
    progress: bool = True,
) -> pd.DataFrame:
    """
    Parameters
    ----------
    df
        A pandas or Dask DataFrame containing the data to be cleaned.
    column
        The name of the column containing the numbers to be cleaned.
    input_format
        The type of the input numbers.
        (default: "auto")
         -  auto
            Infer the input format automatically
         -  NRT
            Número de Registre Tributari, Andorra tax number
         -  NIPT
            Numri i Identifikimit për Personin e Tatueshëm, Albanian VAT number
         -  CBU
            Clave Bancaria Uniforme, Argentine bank account number
         -  CUIT
            Código Único de Identificación Tributaria, Argentinian tax number
         -  DNI
            Documento Nacional de Identidad, Argentinian national identity nr.
         -  Austrian Company Register Numbers
         -  Postleitzahl
            Austrian postal code
         -  Abgabenkontonummer
            Austrian tax identification number
         -  UID
            Umsatzsteuer-Identifikationsnummer, Austrian VAT number
         -  VNR, SVNR, VSNR
            Versicherungsnummer, Austrian social security number
         -  ABN
            Australian Business Number
         -  ACN
            Australian Company Number
         -  TFN
            Australian Tax File Number
         -  Belgian IBAN
            International Bank Account Number
         -  BTW, TVA, NWSt, ondernemingsnummer
            Belgian enterprise number
         -  EGN
            ЕГН, Единен граждански номер, Bulgarian personal identity codes
         -  PNF
            ЛНЧ, Личен номер на чужденец, Bulgarian number of a foreigner
         -  VAT
            Идентификационен номер по ДДС, Bulgarian VAT number
         -  BIC
            ISO 9362 Business identifier codes
         -  Bitcoin address
         -  CNPJ
            Cadastro Nacional da Pessoa Jurídica, Brazilian company identifier
         -  CPF
            Cadastro de Pessoas Físicas, Brazilian national identifier
         -  УНП, UNP
            Учетный номер плательщика, the Belarus VAT number
         -  BN
            Canadian Business Number
         -  SIN
            Canadian Social Insurance Number
         -  CAS RN
            Chemical Abstracts Service Registry Number
         -  ESR, ISR, QR-reference
            reference number on Swiss payment slips
         -  Swiss social security number
            "Sozialversicherungsnummer"
         -  UID
            Unternehmens-Identifikationsnummer, Swiss business identifier
         -  VAT, MWST, TVA, IVA, TPV
            Mehrwertsteuernummer, the Swiss VAT number
         -  RUT
            Rol Único Tributario, Chilean national tax number
         -  RIC No.
            Chinese Resident Identity Card Number
         -  USCC
            Unified Social Credit Code, 统一社会信用代码, China tax number
         -  NIT
            Número De Identificación Tributaria, Colombian identity code
         -  CPF
            Cédula de Persona Física, Costa Rica physical person ID number
         -  CPJ
            Cédula de Persona Jurídica, Costa Rica tax number
         -  CR
            Cédula de Residencia, Costa Rica foreigners ID number
         -  NI
            Número de identidad, Cuban identity card numbers
         -  CUSIP number
            financial security identification number
         -  Αριθμός Εγγραφής Φ.Π.Α.
            Cypriot VAT number
         -  DIČ
            Daňové identifikační číslo, Czech VAT number
         -  RČ
            Rodné číslo, the Czech birth number
         -  Handelsregisternummer
            German company register number
         -  IdNr
            Steuerliche Identifikationsnummer, German personal tax number
         -  St.-Nr.
            Steuernummer, German tax number
         -  Ust ID Nr.
            Umsatzsteuer-Identifikationsnummer, German VAT number
         -  Wertpapierkennnummer
            German securities identification code
         -  CPR
            personnummer, the Danish citizen number
         -  CVR
            Momsregistreringsnummer, Danish VAT number
         -  Cedula
            Dominican Republic national identification number
         -  NCF
            Números de Comprobante Fiscal, Dominican Republic receipt number
         -  RNC
            Registro Nacional del Contribuyente, Dominican Republic tax number
         -  EAN
            International Article Number
         -  CI
            Cédula de identidad, Ecuadorian personal identity code
         -  RUC
            Registro Único de Contribuyentes, Ecuadorian company tax number
         -  Isikukood
            Estonian Personal ID number
         -  KMKR
            Käibemaksukohuslase, Estonian VAT number
         -  Registrikood
            Estonian organisation registration code
         -  CCC
            Código Cuenta Corriente, Spanish Bank Account Code
         -  CIF
            Código de Identificación Fiscal, Spanish company tax number
         -  CUPS
            Código Unificado de Punto de Suministro, Spanish meter point number
         -  DNI
            Documento Nacional de Identidad, Spanish personal identity codes
         -  Spanish IBAN
            International Bank Account Number
         -  NIE
            Número de Identificación de Extranjero, Spanish foreigner number
         -  NIF
            Número de Identificación Fiscal, Spanish VAT number
         -  Referencia Catastral
            Spanish real estate property id
         -  SEPA Identifier of the Creditor
            AT-02
         -  Euro banknote serial numbers
         -  EIC
            European Energy Identification Code
         -  NACE
            classification for businesses in the European Union
         -  VAT
            European Union VAT number
         -  ALV nro
            Arvonlisäveronumero, Finnish VAT number
         -  Finnish Association Identifier
         -  HETU
            Henkilötunnus, Finnish personal identity code
         -  Veronumero
            Finnish individual tax number
         -  Y-tunnus
            Finnish business identifier
         -  FIGI
            Financial Instrument Global Identifier
         -  NIF
            Numéro d'Immatriculation Fiscale, French tax identification number
         -  NIR
            French personal identification number
         -  SIREN
            a French company identification number
         -  SIRET
            a French company establishment identification number
         -  n° TVA
            taxe sur la valeur ajoutée, French VAT number
         -  NHS
            United Kingdom National Health Service patient identifier
         -  SEDOL number
            Stock Exchange Daily Official List number
         -  UPN
            English Unique Pupil Number
         -  UTR
            United Kingdom Unique Taxpayer Reference
         -  VAT
            United Kingdom and Isle of Man VAT registration number
         -  AMKA
            Αριθμός Μητρώου Κοινωνικής Ασφάλισης, Greek social security number
         -  FPA, ΦΠΑ, ΑΦΜ
            Αριθμός Φορολογικού Μητρώου, the Greek VAT number
         -  GRid
            Global Release Identifier
         -  GS1-128
            Standard to encode product information in Code 128 barcodes
         -  NIT
            Número de Identificación Tributaria, Guatemala tax number
         -  OIB
            Osobni identifikacijski broj, Croatian identification number
         -  ANUM
            Közösségi adószám, Hungarian VAT number
         -  IBAN
            International Bank Account Number
         -  NPWP
            Nomor Pokok Wajib Pajak, Indonesian VAT Number
         -  PPS No
            Personal Public Service Number, Irish personal number
         -  VAT
            Irish tax reference number
         -  Company Number
            מספר חברה, or short ח.פ. Israeli company number
         -  Identity Number
            Mispar Zehut, מספר זהות, Israeli identity number
         -  IMEI
            International Mobile Equipment Identity
         -  IMO number
            International Maritime Organization number
         -  IMSI
            International Mobile Subscriber Identity
         -  Aadhaar
            Indian digital resident personal identity number
         -  PAN
            Permanent Account Number, Indian income tax identifier
         -  Kennitala
            Icelandic personal and organisation identity code
         -  VSK number
            Virðisaukaskattsnúmer, Icelandic VAT number
         -  ISAN
            International Standard Audiovisual Number
         -  ISBN
            International Standard Book Number
         -  ISIL
            International Standard Identifier for Libraries
         -  ISIN
            International Securities Identification Number
         -  ISMN
            International Standard Music Number
         -  ISO 11649
            Structured Creditor Reference
         -  ISO 6346
            International standard for container identification
         -  ISSN
            International Standard Serial Number
         -  AIC
            Italian code for identification of drugs
         -  Codice Fiscale
            Italian tax code for individuals
         -  Partita IVA
            Italian VAT number
         -  CN
            法人番号, hōjin bangō, Japanese Corporate Number
         -  BRN
            사업자 등록 번호, South Korea Business Registration Number
         -  RRN
            South Korean resident registration number
         -  LEI
            Legal Entity Identifier
         -  PEID
            Liechtenstein tax code for individuals and entities
         -  Asmens kodas
            Lithuanian, personal numbers
         -  PVM
            Pridėtinės vertės mokestis mokėtojo kodas, Lithuanian VAT number
         -  TVA
            taxe sur la valeur ajoutée, Luxembourgian VAT number
         -  PVN
            Pievienotās vērtības nodokļa, Latvian VAT number
         -  MAC address
            Media Access Control address
         -  n° TVA
            taxe sur la valeur ajoutée, Monacan VAT number
         -  IDNO
            Moldavian company identification number
         -  Montenegro IBAN
            International Bank Account Number
         -  MEID
            Mobile Equipment Identifier
         -  VAT
            Maltese VAT number
         -  ID number
            Mauritian national identifier
         -  CURP
            Clave Única de Registro de Población, Mexican personal ID
         -  RFC
            Registro Federal de Contribuyentes, Mexican tax number
         -  NRIC No.
            Malaysian National Registration Identity Card Number
         -  BRIN number
            the Dutch school identification number
         -  BSN
            Burgerservicenummer, the Dutch citizen identification number
         -  Btw-identificatienummer
            Omzetbelastingnummer, the Dutch VAT number
         -  Onderwijsnummer
            the Dutch student identification number
         -  Postcode
            the Dutch postal code
         -  Fødselsnummer
            Norwegian birth number, the national identity number
         -  Norwegian IBAN
            International Bank Account Number
         -  Konto nr.
            Norwegian bank account number
         -  MVA
            Merverdiavgift, Norwegian VAT number
         -  Orgnr
            Organisasjonsnummer, Norwegian organisation number
         -  New Zealand bank account number
         -  IRD number
            New Zealand Inland Revenue Department (Te Tari Tāke) number
         -  CUI
            Cédula Única de Identidad, Peruvian identity number
         -  RUC
            Registro Único de Contribuyentes, Peruvian company tax number
         -  NIP
            Numer Identyfikacji Podatkowej, Polish VAT number
         -  PESEL
            Polish national identification number
         -  REGON
            Rejestr Gospodarki Narodowej, Polish register of economic units
         -  NIF
            Número de identificação fiscal, Portuguese VAT number
         -  RUC number
            Registro Único de Contribuyentes, Paraguay tax number
         -  CF
            Cod de înregistrare în scopuri de TVA, Romanian VAT number
         -  CNP
            Cod Numeric Personal, Romanian Numerical Personal Code
         -  CUI or CIF
            Codul Unic de Înregistrare, Romanian company identifier
         -  ONRC
            Ordine din Registrul Comerţului, Romanian Trade Register identifier
         -  PIB
            Poreski Identifikacioni Broj, Serbian tax identification number
         -  ИНН
            Идентификационный номер налогоплательщика, Russian tax identifier
         -  Orgnr
            Organisationsnummer, Swedish company number
         -  Personnummer
            Swedish personal identity number
         -  VAT
            Moms, Mervärdesskatt, Swedish VAT number
         -  UEN
            Singapore's Unique Entity Number
         -  ID za DDV
            Davčna številka, Slovenian VAT number
         -  IČ DPH
            IČ pre daň z pridanej hodnoty, Slovak VAT number
         -  RČ
            Rodné číslo, the Slovak birth number
         -  COE
            Codice operatore economico, San Marino national tax number
         -  NIT
            Número de Identificación Tributaria, El Salvador tax number
         -  T.C. Kimlik No.
            Turkish personal identification number
         -  VKN
            Vergi Kimlik Numarası, Turkish tax identification number
         -  UBN
            Unified Business Number, 統一編號, Taiwanese tax number
         -  ЄДРПОУ, EDRPOU
            Identifier for enterprises and organizations in Ukraine
         -  РНОКПП, RNTRC
            Individual taxpayer registration number in Ukraine
         -  ATIN
            U.S. Adoption Taxpayer Identification Number
         -  EIN
            U.S. Employer Identification Number
         -  ITIN
            U.S. Individual Taxpayer Identification Number
         -  PTIN
            U.S. Preparer Tax Identification Number
         -  RTN
            U.S. routing transit number
         -  SSN
            U.S. Social Security Number
         -  TIN
            U.S. Taxpayer Identification Number
         -  RUT
            Registro Único Tributario, Uruguay tax number
         -  VATIN
            International value added tax identification number
         -  RIF
            Registro de Identificación Fiscal, Venezuelan VAT number
         -  MST
            Mã số thuế, Vietnam tax number
         -  ID number
            South African Identity Document number
         -  TIN
            South African Tax Identification Number
    output_format
        The format of the cleaned numbers.
        (default: "standard")
        - 'compact'
            Compact format, without any separators.
        - 'standard'
            Standard format of the recognized type.
    format_keyword
        Keyword used to pick the intended type when an input number matches
        more than one type.
        (default: None)
    inplace
        If True, delete the column containing the data that was cleaned. Otherwise,
        keep the original column.
        (default: False)
    report
        If True, output the summary report. Otherwise, output no report.
        (default: True)
    progress
        If True, display a progress bar.
        (default: True)
    """

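The two `output_format` options can be illustrated with a small sketch. It assumes a hypothetical registry, `STANDARD_FORMATTERS`, that maps a type name to a function producing that type's standard layout. `apply_output_format` and `format_issn` are illustrative names as well, and falling back to the compact form for types without a registered formatter is an assumption, not part of the design.

```python
from typing import Callable, Dict


def compact(value: str) -> str:
    """Drop separators so only letters and digits remain."""
    return "".join(ch for ch in value if ch.isalnum()).upper()


def format_issn(number: str) -> str:
    """ISSNs are conventionally written as two groups of four: NNNN-NNNC."""
    return f"{number[:4]}-{number[4:]}"


# Hypothetical registry: type name -> function producing the standard layout.
STANDARD_FORMATTERS: Dict[str, Callable[[str], str]] = {
    "ISSN": format_issn,
}


def apply_output_format(
    value: str, number_type: str, output_format: str = "standard"
) -> str:
    """Render a validated number in the requested output format."""
    number = compact(value)
    if output_format == "compact":
        return number
    if output_format == "standard":
        # Assumption: types without a registered formatter stay compact.
        formatter = STANDARD_FORMATTERS.get(number_type, lambda n: n)
        return formatter(number)
    raise ValueError(f"unknown output_format: {output_format!r}")


print(apply_output_format("0317 8471", "ISSN"))
print(apply_output_format("0317-8471", "ISSN", output_format="compact"))
```

Keeping the formatters in a per-type registry means a new number type only needs one more entry, which fits the long list of supported types in the docstring.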
# Resources
   1. [python-stdnum](https://arthurdejong.org/python-stdnum/)