# A step-by-step guide to documenting a dataset

This notebook documents a dataset using [Frictionless](https://frictionlessdata.io/introduction/). We use it because it is a widely accepted open standard for tabular data, developed by and for the open data community. As a result, it encodes an ethic of documenting in the best interest of people you have not met yet, but who might want to reuse your data. In practice, this means making explicit your implicit knowledge about the data: how, when and by whom they were collected, what they are intended for, and any relevant context (for example: "this dataset was collected during the COVID-19 pandemic, which is likely to have influenced responses ...").

The two main parts of the Frictionless project are: 

* An open standard to document datasets.
* Software that makes the work of cleaning, documenting and publishing open data much easier, notably the [Frictionless framework](https://v4.framework.frictionlessdata.io/).

The standard is called Data Package. The idea is this:

* All the data files you want to document as a coherent whole (normally `.csv` files) are in the same directory. For example, you might have a file containing the text of interviews and another one containing demographic characteristics of the people you interviewed. They contain different information, but are part of the same project. These files are called *resources*.
* In that same directory, you put a metadata file with a standardized name (like `datapackage.json`) that describes each resource, and each variable in each resource. The formats accepted for the metadata files are JSON and YAML. For qualitative data we prefer YAML, as it's more human readable.
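
To make the layout above concrete, here is a minimal sketch of what a descriptor for the interview example could contain, written as a Python dictionary and serialized with the standard `json` module. The package name, file names, field names and descriptions are all hypothetical; only the overall shape (a list of resources, each with a schema listing its fields) follows the description above.

```python
import json

# Hypothetical file names and fields, loosely following the interview example above.
descriptor = {
    "name": "interview-project",
    "resources": [
        {
            "name": "interviews",
            "path": "interviews.csv",
            "schema": {
                "fields": [
                    {"name": "respondent_id", "type": "string",
                     "description": "Pseudonymous identifier of the interviewee"},
                    {"name": "text", "type": "string",
                     "description": "Transcript of the interview"},
                ]
            },
        },
        {
            "name": "demographics",
            "path": "demographics.csv",
            "schema": {
                "fields": [
                    {"name": "respondent_id", "type": "string"},
                    {"name": "age_group", "type": "string"},
                ]
            },
        },
    ],
}

# This text is what would be saved as datapackage.json next to the CSV files.
descriptor_text = json.dumps(descriptor, indent=2)
print(descriptor_text)
```

The same structure can be written in YAML; JSON is used here only because it needs nothing beyond the standard library.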

There are two ways to use the framework: through the Command Line Interface (CLI) or as a Python library. The documentation is more complete for the Python library, which is what we use here. What follows assumes you have Python 3 and have installed the Frictionless library. The framework's documentation contains an installation guide in case you don't have these components on your computer yet.

## Step 1: import the library

The `import frictionless` command imports the library. The other lines are specific to my machine: they make sure that Python finds the library's files, and they import `pprint` for human-readable printing to screen.

In [3]:
import sys
paths = ['', '/Users/albertocottica/Documents', '/Library/Frameworks/Python.framework/Versions/3.9/lib/python39.zip', '/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9', '/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/lib-dynload', '/Users/albertocottica/Library/Python/3.9/lib/python/site-packages', '/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages']
for p in paths:
    sys.path.append(p)
import frictionless as fl
import pprint
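
The hard-coded paths above only make sense on the machine where this notebook was written. If the library is installed in the environment that runs the notebook, a plain `import frictionless` should be enough. As a sketch, the helper below (a hypothetical name, not part of Frictionless) checks whether Python can locate a module before you try to import it.

```python
import importlib.util


def library_available(module_name: str) -> bool:
    """Return True if Python can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Prints True or False depending on the current environment.
print("frictionless importable:", library_available("frictionless"))
```

If this prints `False`, the installation guide in the framework's documentation is the place to start.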

## Step 2: import the data

In [45]:
dirPath = './'
filename = dirPath + 'data/' + 'small_businesses_digitalization_survey.csv'
with open (filename, 'r') as csvfile:
    pprint.pprint(csvfile.read()) # get an idea of what it looks like

('_uuid,start,end,country,age_group,role_in_business,female,business_sector,business_age,business_size,bank_account,business_registration,social_security,union,formal_none,formalization_level,formal,digital_tools,whatsapp,facebook,mobile_wallet,grab_uber_jumla,delivery_app,digital_tools_other,digitalization_level,attract_customers,order_materials,process_payments,manage_logistics,purpose_other,digitalization_purpose,covid_digitalization,stopped_using,helped_increase_sales,helped_find_new_customers,helped_find_cheaper_materials,helped_save_time,helped_track_money,helped_reduce_hassles,helped_get_customers,helped_cut_intermediaries,helped_seek_grants,digitalization_help_count,negative_effects,sales_change,expensive,dontknowhowtouse,havent_tried,bad_internet,dont_need,give_information,nondigitalization_reason_other\n'
 'a0dc3688-2144-4ffd-aea8-00e8e672cd47,2022-03-30,2022-03-30,Fiji/ '
 'Vanuatu,20-35 years old,I am the owner/founder,True,Handcrafts,Between 3 and '
 '5 years ago,Its only 
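
Printing the whole file gets unwieldy for anything but small datasets. As an alternative sketch, the standard `csv` module can return just the header and the first few rows. The helper name `preview_rows` and the three-column sample are made up for illustration.

```python
import csv
import io
from itertools import islice


def preview_rows(csv_file, n_rows=3):
    """Return the header and the first n_rows data rows of an open CSV file."""
    reader = csv.reader(csv_file)
    header = next(reader)
    return header, list(islice(reader, n_rows))


# Made-up sample standing in for the survey file; with a real file, pass
# open(filename, newline='', encoding='utf-8') instead of io.StringIO.
sample = io.StringIO(
    "_uuid,start,country\n"
    "id-1,2022-03-30,Ethiopia\n"
    "id-2,2022-03-30,morocco\n"
)
header, rows = preview_rows(sample, n_rows=1)
print(header)
print(rows)
```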

## Step 3: validate

Check whether there are any errors in the data. 

In [46]:
report = fl.validate(filename)
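# to_summary is a method: without the parentheses, print() shows the bound method's repr, as in the output below
print(report.to_summary())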

<bound method Report.to_summary of {'errors': [],
 'stats': {'errors': 0, 'tasks': 1},
 'tasks': [{'errors': [],
            'partial': False,
            'resource': {'encoding': 'utf-8',
                         'format': 'csv',
                         'hashing': 'md5',
                         'name': 'small_businesses_digitalization_survey',
                         'path': './data/small_businesses_digitalization_survey.csv',
                         'profile': 'tabular-data-resource',
                         'schema': {'fields': [{'name': '_uuid',
                                                'type': 'string'},
                                               {'name': 'start',
                                                'type': 'date'},
                                               {'name': 'end', 'type': 'date'},
                                               {'name': 'country',
                                                'type': 'string'},
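
To give an idea of the kind of structural problem a validation step looks for, here is a rough stand-in written with the standard library only. It flags rows whose number of cells differs from the header. This is an illustration of the concept, not how Frictionless implements its checks, and `find_ragged_rows` is a hypothetical helper.

```python
import csv
import io


def find_ragged_rows(csv_file):
    """Return (row_number, cell_count) for rows whose length differs from the header's."""
    reader = csv.reader(csv_file)
    header = next(reader)
    problems = []
    # Row 1 is the header, so data rows are numbered from 2.
    for row_number, row in enumerate(reader, start=2):
        if len(row) != len(header):
            problems.append((row_number, len(row)))
    return problems


# Made-up sample: the last row is one cell short.
sample = io.StringIO(
    "_uuid,start,end\n"
    "id-1,2022-03-30,2022-03-30\n"
    "id-2,2022-03-30\n"
)
problems = find_ragged_rows(sample)
print(problems)
```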
                            

## Step 4: create the metadata document

The easiest way to do this is to use the software to automatically generate a YAML file with all the appropriate variables, then edit those variables manually. The variables' names and types are automatically created, but you will need to add descriptions manually. In our case, we refer to the questionnaire to do that. 

Pay close attention to the types of variables. If Frictionless detects a variable that contains `Yes` in some cells and `No` in some others, it will most likely treat it as a string variable, when it is really a Boolean. Similarly, values like `24/09/2015` are likely to be read as strings, when they are really dates. In our case, many answers have been encoded as `0` or `1` and are treated as integers, but are really Booleans, so we change the metadata file to take that into account. 

Assigning the correct type to each variable helps the people who will use your data in the future. They will be able to import the data with the correct types already assigned, and so avoid manual conversions.
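
As a sketch of what "the correct type" means for the examples above, the functions below turn `Yes`/`No` or `0`/`1` cells into Booleans and day-first strings such as `24/09/2015` into dates. The function names and the accepted spellings are assumptions made for this illustration. To our understanding, the Table Schema specification lets a descriptor express the same intent declaratively, with `trueValues` and `falseValues` on a boolean field and a `format` pattern on a date field.

```python
from datetime import date, datetime

# Spellings accepted by this sketch; extend them to match your own data.
TRUE_VALUES = {"yes", "true", "1"}
FALSE_VALUES = {"no", "false", "0"}


def parse_boolean(cell: str):
    """Map a Yes/No or 0/1 style cell to a bool; empty cells become None."""
    value = cell.strip().lower()
    if value == "":
        return None
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"not a recognised boolean: {cell!r}")


def parse_day_first_date(cell: str) -> date:
    """Parse dates written as day/month/year, such as 24/09/2015."""
    return datetime.strptime(cell, "%d/%m/%Y").date()


print(parse_boolean("Yes"), parse_boolean("0"), parse_day_first_date("24/09/2015"))
```

Cells that match neither list raise an error rather than being guessed at, which is usually what you want when documenting data for other people.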

In [47]:
packagename = dirPath + 'data/' + 'small_businesses_digitalization_survey.csv'
print(packagename)
package = fl.describe(packagename, trusted = True)
print(package.to_yaml())

./data/small_businesses_digitalization_survey.csv
path: ./data/small_businesses_digitalization_survey.csv
name: small_businesses_digitalization_survey
profile: tabular-data-resource
scheme: file
format: csv
hashing: md5
encoding: utf-8
schema:
  fields:
    - type: string
      name: _uuid
    - type: date
      name: start
    - type: date
      name: end
    - type: string
      name: country
    - type: string
      name: age_group
    - type: string
      name: role_in_business
    - type: boolean
      name: female
    - type: string
      name: business_sector
    - type: string
      name: business_age
    - type: string
      name: business_size
    - type: boolean
      name: bank_account
    - type: boolean
      name: business_registration
    - type: boolean
      name: social_security
    - type: boolean
      name: union
    - type: boolean
      name: formal_none
    - type: number
      name: formalization_level
    - type: number
      name: formal
    - type: boolean


## Step 5: manually edit the YAML file and convert to JSON

Todo
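
Until this step is written up, here is a minimal sketch of the two edits described in Step 4 (adding descriptions and correcting types), applied to a schema held as a plain Python dictionary. The helper `annotate_fields` is hypothetical, the description text is not taken from the questionnaire, and treating `formal` as a Boolean is an assumption made only for the example.

```python
import copy


def annotate_fields(schema, descriptions, type_overrides=None):
    """Return a copy of a schema dict with descriptions added and types corrected."""
    type_overrides = type_overrides or {}
    updated = copy.deepcopy(schema)
    for field in updated["fields"]:
        name = field["name"]
        if name in descriptions:
            field["description"] = descriptions[name]
        if name in type_overrides:
            field["type"] = type_overrides[name]
    return updated


# A two-field excerpt of the schema printed in Step 4.
schema = {"fields": [{"name": "female", "type": "boolean"},
                     {"name": "formal", "type": "number"}]}

# Illustrative description and type override (both assumptions, see above).
annotated = annotate_fields(
    schema,
    descriptions={"female": "True if the respondent identifies as female"},
    type_overrides={"formal": "boolean"},
)
print(annotated)
```

Editing the YAML file by hand achieves the same result; the function is only useful when many fields need the same kind of change.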

In [48]:
packagej = fl.Package(dirPath + 'data/' + 'small_businesses_digitalization_survey.yaml')
# print(packagej.to_json(dirPath + 'data/Informality paper/' + 'datapackage.json'))
report = fl.validate(dirPath + 'data/' + 'datapackage.json')
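# to_summary is a method: without the parentheses, print() shows the bound method's repr, as in the output below
print(report.to_summary())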

<bound method Report.to_summary of {'errors': [{'code': 'error',
             'description': 'Error',
             'message': 'cannot extract metadata "tabular-data-resource" '
                        'because "[Errno 2] No such file or directory: '
                        '\'tabular-data-resource\'"',
             'name': 'Error',
             'note': 'cannot extract metadata "tabular-data-resource" because '
                     '"[Errno 2] No such file or directory: '
                     '\'tabular-data-resource\'"',
             'tags': []}],
 'stats': {'errors': 1, 'tasks': 0},
 'tasks': [],
 'time': 0.0,
 'valid': False,
 'version': '4.40.8'}>
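
The error message mentions `tabular-data-resource`, which is the `profile` of the descriptor printed in Step 4: that descriptor describes a single resource. A Data Package descriptor, by contrast, lists its resources under a `resources` key. We have not confirmed that this mismatch is what causes the error, but as a sketch, wrapping the resource descriptor in a package-level dictionary looks like this. The helper and the package name are hypothetical, and the path assumes `datapackage.json` sits in the same directory as the CSV file.

```python
import json


def wrap_resource_as_package(resource, package_name):
    """Build a minimal package descriptor that lists one resource."""
    return {"name": package_name, "resources": [resource]}


# Abbreviated copy of the resource descriptor printed in Step 4.
resource = {
    "path": "small_businesses_digitalization_survey.csv",
    "name": "small_businesses_digitalization_survey",
    "profile": "tabular-data-resource",
    "schema": {"fields": [{"name": "_uuid", "type": "string"}]},
}

# "small-businesses-digitalization" is a made-up package name.
package_descriptor = wrap_resource_as_package(resource, "small-businesses-digitalization")
print(json.dumps(package_descriptor, indent=2))
```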


In [6]:
import pandas as pd
import numpy as np
df = pd.read_csv(filename)
df

Unnamed: 0,_uuid,start,end,country,age_group,role_in_business,female,business_sector,business_age,business_size,...,negative_effects,sales_change,expensive,dontknowhowtouse,havent_tried,bad_internet,dont_need,give_information,nondigitalization_reason_other,nondigitalization_reasons
0,a0dc3688-2144-4ffd-aea8-00e8e672cd47,2022-03-30,2022-03-30,Fiji/ Vanuatu,20-35 years old,I am the owner/founder,True,Handcrafts,Between 3 and 5 years ago,Its only me,...,No,4 - I sell a lot more,,,,,,,,
1,95cf8a5a-950b-417b-b5b6-25f286dd7b01,2022-03-30,2022-03-30,Fiji/ Vanuatu,20-35 years old,Other,True,Selling food or drinks,Less than 1 year ago,Between 3 and 10 employees,...,I don't know,4 - I sell a lot more,,,,,,,,
2,99dcf5ee-06b2-4320-82e7-7d307cc27d77,2022-03-30,2022-03-30,Fiji/ Vanuatu,20-35 years old,I am the owner/founder,True,Other,Between 1 and 2 years ago,Its only me,...,No,2 - Sales have stayed the same,,,,,,,,
3,eec5841b-b060-416f-89f5-f2dfa96d1cf9,2022-03-30,2022-03-30,Ethiopia,20-35 years old,I am the owner/founder,True,Handcrafts,Between 3 and 5 years ago,Less than 3 employees,...,Yes,3 - I sell a bit more,,,,,,,,
4,d3ce729d-bd66-4763-8884-a3de017eab8a,2022-03-30,2022-03-30,Ethiopia,36-50 years old,I am the owner/founder,True,Tourism and Hospitality,Between 1 and 2 years ago,Between 3 and 10 employees,...,I don't know,4 - I sell a lot more,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1008,67361469-42f4-4f01-90ed-09505cc1a3aa,2022-04-12,2022-04-12,morocco,20-35 years old,I am the owner/founder,True,Other,Between 2 and 3 years ago,Its only me,...,No,3 - I sell a bit more,,,,,,,,
1009,bfe5b1ef-4ca1-47dd-a0de-ee7d1e605a92,2022-04-13,2022-04-13,morocco,20-35 years old,I am the owner/founder,True,Manufacturing / Factory,Between 3 and 5 years ago,Between 3 and 10 employees,...,No,2 - Sales have stayed the same,,,,,,,,
1010,9526a9ca-d7dc-4d1d-8a2a-efff6e6cb882,2022-04-14,2022-04-14,morocco,50-65 years old,I am the owner/founder,True,Tourism and Hospitality,Between 3 and 5 years ago,Between 3 and 10 employees,...,No,3 - I sell a bit more,,,,,,,,
1011,b37b1d36-4f31-4fec-a9e7-344f38673997,2022-04-14,2022-04-14,morocco,20-35 years old,Other,True,Education,Between 3 and 5 years ago,Between 3 and 10 employees,...,Yes,1 - I sell a bit less,,,,,,,,


In [18]:
df.insert(6, 'female', df.pop('female'))
df

Unnamed: 0,_uuid,start,end,country,age_group,role_in_business,female,business_sector,business_age,business_size,...,negative_effects,sales_change,expensive,dontknowhowtouse,havent_tried,bad_internet,dont_need,give_information,nondigitalization_reason_other,nondigitalization_reasons
0,a0dc3688-2144-4ffd-aea8-00e8e672cd47,2022-03-30,2022-03-30,Fiji/ Vanuatu,20-35 years old,I am the owner/founder,True,Handcrafts,Between 3 and 5 years ago,Its only me,...,No,4 - I sell a lot more,,,,,,,,
1,95cf8a5a-950b-417b-b5b6-25f286dd7b01,2022-03-30,2022-03-30,Fiji/ Vanuatu,20-35 years old,Other,True,Selling food or drinks,Less than 1 year ago,Between 3 and 10 employees,...,I don't know,4 - I sell a lot more,,,,,,,,
2,99dcf5ee-06b2-4320-82e7-7d307cc27d77,2022-03-30,2022-03-30,Fiji/ Vanuatu,20-35 years old,I am the owner/founder,True,Other,Between 1 and 2 years ago,Its only me,...,No,2 - Sales have stayed the same,,,,,,,,
3,eec5841b-b060-416f-89f5-f2dfa96d1cf9,2022-03-30,2022-03-30,Ethiopia,20-35 years old,I am the owner/founder,True,Handcrafts,Between 3 and 5 years ago,Less than 3 employees,...,Yes,3 - I sell a bit more,,,,,,,,
4,d3ce729d-bd66-4763-8884-a3de017eab8a,2022-03-30,2022-03-30,Ethiopia,36-50 years old,I am the owner/founder,True,Tourism and Hospitality,Between 1 and 2 years ago,Between 3 and 10 employees,...,I don't know,4 - I sell a lot more,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1008,67361469-42f4-4f01-90ed-09505cc1a3aa,2022-04-12,2022-04-12,morocco,20-35 years old,I am the owner/founder,True,Other,Between 2 and 3 years ago,Its only me,...,No,3 - I sell a bit more,,,,,,,,
1009,bfe5b1ef-4ca1-47dd-a0de-ee7d1e605a92,2022-04-13,2022-04-13,morocco,20-35 years old,I am the owner/founder,True,Manufacturing / Factory,Between 3 and 5 years ago,Between 3 and 10 employees,...,No,2 - Sales have stayed the same,,,,,,,,
1010,9526a9ca-d7dc-4d1d-8a2a-efff6e6cb882,2022-04-14,2022-04-14,morocco,50-65 years old,I am the owner/founder,True,Tourism and Hospitality,Between 3 and 5 years ago,Between 3 and 10 employees,...,No,3 - I sell a bit more,,,,,,,,
1011,b37b1d36-4f31-4fec-a9e7-344f38673997,2022-04-14,2022-04-14,morocco,20-35 years old,Other,True,Education,Between 3 and 5 years ago,Between 3 and 10 employees,...,Yes,1 - I sell a bit less,,,,,,,,


In [56]:
df.to_csv(filename, index=False)

In [7]:
pd.set_option('display.max_columns', None)


In [8]:
df

Unnamed: 0,_uuid,start,end,country,age_group,role_in_business,female,business_sector,business_age,business_size,bank_account,business_registration,social_security,union,formal_none,formalization_level,formal,digital_tools,whatsapp,facebook,mobile_wallet,grab_uber_jumla,delivery_app,digital_tools_other,digitalization_level,attract_customers,order_materials,process_payments,manage_logistics,purpose_other,digitalization_purpose,covid_digitalization,stopped_using,helped_increase_sales,helped_find_new_customers,helped_find_cheaper_materials,helped_save_time,helped_track_money,helped_reduce_hassles,helped_get_customers,helped_cut_intermediaries,helped_seek_grants,digitalization_help_count,negative_effects,sales_change,expensive,dontknowhowtouse,havent_tried,bad_internet,dont_need,give_information,nondigitalization_reason_other,nondigitalization_reasons
0,a0dc3688-2144-4ffd-aea8-00e8e672cd47,2022-03-30,2022-03-30,Fiji/ Vanuatu,20-35 years old,I am the owner/founder,True,Handcrafts,Between 3 and 5 years ago,Its only me,True,False,False,False,0.0,1.0,1.0,True,False,True,False,False,False,True,2.0,True,False,False,True,True,3.0,False,False,True,True,False,True,False,False,True,False,True,5.0,No,4 - I sell a lot more,,,,,,,,
1,95cf8a5a-950b-417b-b5b6-25f286dd7b01,2022-03-30,2022-03-30,Fiji/ Vanuatu,20-35 years old,Other,True,Selling food or drinks,Less than 1 year ago,Between 3 and 10 employees,False,False,False,False,1.0,0.0,0.0,True,False,True,False,False,False,True,2.0,True,False,False,False,False,1.0,True,False,True,False,False,True,False,False,True,False,False,3.0,I don't know,4 - I sell a lot more,,,,,,,,
2,99dcf5ee-06b2-4320-82e7-7d307cc27d77,2022-03-30,2022-03-30,Fiji/ Vanuatu,20-35 years old,I am the owner/founder,True,Other,Between 1 and 2 years ago,Its only me,False,False,False,False,1.0,0.0,0.0,True,True,True,False,False,False,True,3.0,True,True,False,True,False,3.0,True,False,True,True,True,True,False,False,True,False,False,5.0,No,2 - Sales have stayed the same,,,,,,,,
3,eec5841b-b060-416f-89f5-f2dfa96d1cf9,2022-03-30,2022-03-30,Ethiopia,20-35 years old,I am the owner/founder,True,Handcrafts,Between 3 and 5 years ago,Less than 3 employees,True,False,False,False,0.0,1.0,1.0,True,False,True,False,False,False,False,1.0,True,True,False,False,False,2.0,True,False,False,True,False,False,False,False,True,False,False,2.0,Yes,3 - I sell a bit more,,,,,,,,
4,d3ce729d-bd66-4763-8884-a3de017eab8a,2022-03-30,2022-03-30,Ethiopia,36-50 years old,I am the owner/founder,True,Tourism and Hospitality,Between 1 and 2 years ago,Between 3 and 10 employees,False,False,False,False,1.0,0.0,0.0,True,False,True,False,False,False,True,2.0,True,False,False,False,False,1.0,True,False,True,True,False,True,False,False,False,False,False,3.0,I don't know,4 - I sell a lot more,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1008,67361469-42f4-4f01-90ed-09505cc1a3aa,2022-04-12,2022-04-12,morocco,20-35 years old,I am the owner/founder,True,Other,Between 2 and 3 years ago,Its only me,True,True,False,False,0.0,2.0,1.0,True,True,True,False,False,False,True,3.0,True,True,False,False,True,3.0,False,False,True,True,False,True,False,False,True,False,True,5.0,No,3 - I sell a bit more,,,,,,,,
1009,bfe5b1ef-4ca1-47dd-a0de-ee7d1e605a92,2022-04-13,2022-04-13,morocco,20-35 years old,I am the owner/founder,True,Manufacturing / Factory,Between 3 and 5 years ago,Between 3 and 10 employees,True,False,True,True,1.0,2.0,1.0,True,True,False,True,False,False,False,2.0,True,True,False,True,False,3.0,False,False,False,False,False,True,False,False,False,False,False,1.0,No,2 - Sales have stayed the same,,,,,,,,
1010,9526a9ca-d7dc-4d1d-8a2a-efff6e6cb882,2022-04-14,2022-04-14,morocco,50-65 years old,I am the owner/founder,True,Tourism and Hospitality,Between 3 and 5 years ago,Between 3 and 10 employees,True,True,True,True,0.0,3.0,1.0,True,True,True,True,False,False,False,3.0,True,False,False,False,False,1.0,False,False,False,True,False,True,False,True,False,False,False,3.0,No,3 - I sell a bit more,,,,,,,,
1011,b37b1d36-4f31-4fec-a9e7-344f38673997,2022-04-14,2022-04-14,morocco,20-35 years old,Other,True,Education,Between 3 and 5 years ago,Between 3 and 10 employees,,,,,,,,True,True,True,False,False,False,True,3.0,True,False,True,False,True,3.0,True,False,False,False,True,True,False,True,False,False,False,3.0,Yes,1 - I sell a bit less,,,,,,,,


In [170]:
filename2 = dirPath + 'sm_formality_digitalization_data.csv'
df2 = pd.read_csv(filename2)
df2

Unnamed: 0,_uuid,start,end,country,age_group,female,role_in_business,business_sector,business_age,business_size,bank_account,business_registration,social_security,union,formal_none,formalization_level_with_SS,formalization_level,formal,digital_tools,whatsapp,facebook,grab_uber_jumla,mobile_wallet,delivery_app,digital_tools_other,digitalization_level,attract_customers,order_materials,process_payments,manage_logistics,purpose_other,digitalization_purpose,covid_digitalization,stopped_using,helped_increase_sales,helped_find_new_customers,helped_find_cheaper_materials,helped_save_time,helped_track_money,helped_get_customers,helped_reduce_hassles,helped_cut_intermediaries,helped_seek_grants,digitalization_help_count,negative_effects,sales_change,expensive,dontknowhowtouse,havent_tried,bad_internet,dont_need,give_information,nondigitalization_reason_other,nondigitalization_reasons
0,ce616fca-bede-4b3f-9ed9-87feeeef10c1,30/03/2022,30/03/2022,Fiji/ Vanuatu,20-35 years old,female,I am the owner/founder,Handcrafts,Between 3 and 5 years ago,Its only me,1.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,"Yes, I use digital tools to run my business",0.0,1.0,0.0,0.0,0.0,1.0,2.0,1.0,0.0,0.0,1.0,1.0,3.0,No,No,1.0,1.0,0.0,1.0,0.0,1.0,0.0,0.0,1.0,5.0,No,I sell a lot more,,,,,,,,
1,c931e182-8cbe-4712-94c5-5bc7083f2f13,30/03/2022,30/03/2022,Fiji/ Vanuatu,20-35 years old,female,Other,Selling food or drinks,Less than 1 year ago,Between 3 and 10 employees,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,"Yes, I use digital tools to run my business",0.0,1.0,0.0,0.0,0.0,1.0,2.0,1.0,0.0,0.0,0.0,0.0,1.0,Yes,No,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,3.0,I don't know,I sell a lot more,,,,,,,,
2,cc9c7d65-6cb9-4b0d-8183-709aea540969,30/03/2022,30/03/2022,Fiji/ Vanuatu,20-35 years old,female,I am the owner/founder,Other,Between 1 and 2 years ago,Its only me,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,"Yes, I use digital tools to run my business",1.0,1.0,0.0,0.0,0.0,1.0,3.0,1.0,1.0,0.0,1.0,0.0,3.0,Yes,No,1.0,1.0,1.0,1.0,0.0,1.0,0.0,0.0,0.0,5.0,No,Sales have stayed the same,,,,,,,,
3,0391370b-4761-44b7-a32e-c9614788e4f5,30/03/2022,30/03/2022,Ethiopia,20-35 years old,female,I am the owner/founder,Handcrafts,Between 3 and 5 years ago,Less than 3 employees,1.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,"Yes, I use digital tools to run my business",0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,0.0,0.0,0.0,2.0,Yes,No,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,2.0,Yes,I sell a bit more,,,,,,,,
4,cb81c7ec-20d4-4340-9b35-b7a8fe87c319,30/03/2022,30/03/2022,Ethiopia,36-50 years old,female,I am the owner/founder,Tourism and Hospitality,Between 1 and 2 years ago,Between 3 and 10 employees,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,"Yes, I use digital tools to run my business",0.0,1.0,0.0,0.0,0.0,1.0,2.0,1.0,0.0,0.0,0.0,0.0,1.0,Yes,No,1.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,3.0,I don't know,I sell a lot more,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1008,278530100,12/04/2022,12/04/2022,morocco,20-35 years old,male,I am the owner/founder,Other,Between 2 and 3 years ago,Its only me,1.0,1.0,0.0,0.0,0.0,2.0,2.0,1.0,"Yes, I use digital tools to run my business",1.0,1.0,0.0,0.0,0.0,1.0,3.0,1.0,1.0,0.0,0.0,1.0,3.0,No,No,1.0,1.0,0.0,1.0,0.0,1.0,0.0,0.0,1.0,5.0,No,I sell a bit more,,,,,,,,
1009,278596718,13/04/2022,13/04/2022,morocco,20-35 years old,male,I am the owner/founder,Manufacturing / Factory,Between 3 and 5 years ago,Between 3 and 10 employees,1.0,0.0,1.0,1.0,1.0,3.0,2.0,1.0,"Yes, I use digital tools to run my business",1.0,0.0,0.0,1.0,0.0,0.0,2.0,1.0,1.0,0.0,1.0,0.0,3.0,No,No,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,No,Sales have stayed the same,,,,,,,,
1010,278901946,14/04/2022,14/04/2022,morocco,50-65 years old,male,I am the owner/founder,Tourism and Hospitality,Between 3 and 5 years ago,Between 3 and 10 employees,1.0,1.0,1.0,1.0,0.0,4.0,3.0,1.0,"Yes, I use digital tools to run my business",1.0,1.0,0.0,1.0,0.0,0.0,3.0,1.0,0.0,0.0,0.0,0.0,1.0,No,No,0.0,1.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,3.0,No,I sell a bit more,,,,,,,,
1011,279034373,14/04/2022,14/04/2022,morocco,20-35 years old,male,Other,Education,Between 3 and 5 years ago,Between 3 and 10 employees,,,,,,,,,"Yes, I use digital tools to run my business",1.0,1.0,0.0,0.0,0.0,1.0,3.0,1.0,0.0,1.0,0.0,1.0,3.0,Yes,No,0.0,0.0,1.0,1.0,0.0,0.0,1.0,0.0,0.0,3.0,Yes,I sell a bit less,,,,,,,,


In [9]:
df = pd.read_csv(filename)
print(df.shape)
df.insert(15, 'formal_none1', np.nan)
# df.drop(df.columns[[0,1,2]], axis=1, inplace=True)
df

(1013, 53)


Unnamed: 0,_uuid,start,end,country,age_group,role_in_business,female,business_sector,business_age,business_size,bank_account,business_registration,social_security,union,formal_none,formal_none1,formalization_level,formal,digital_tools,whatsapp,facebook,mobile_wallet,grab_uber_jumla,delivery_app,digital_tools_other,digitalization_level,attract_customers,order_materials,process_payments,manage_logistics,purpose_other,digitalization_purpose,covid_digitalization,stopped_using,helped_increase_sales,helped_find_new_customers,helped_find_cheaper_materials,helped_save_time,helped_track_money,helped_reduce_hassles,helped_get_customers,helped_cut_intermediaries,helped_seek_grants,digitalization_help_count,negative_effects,sales_change,expensive,dontknowhowtouse,havent_tried,bad_internet,dont_need,give_information,nondigitalization_reason_other,nondigitalization_reasons
0,a0dc3688-2144-4ffd-aea8-00e8e672cd47,2022-03-30,2022-03-30,Fiji/ Vanuatu,20-35 years old,I am the owner/founder,True,Handcrafts,Between 3 and 5 years ago,Its only me,True,False,False,False,0.0,,1.0,1.0,True,False,True,False,False,False,True,2.0,True,False,False,True,True,3.0,False,False,True,True,False,True,False,False,True,False,True,5.0,No,4 - I sell a lot more,,,,,,,,
1,95cf8a5a-950b-417b-b5b6-25f286dd7b01,2022-03-30,2022-03-30,Fiji/ Vanuatu,20-35 years old,Other,True,Selling food or drinks,Less than 1 year ago,Between 3 and 10 employees,False,False,False,False,1.0,,0.0,0.0,True,False,True,False,False,False,True,2.0,True,False,False,False,False,1.0,True,False,True,False,False,True,False,False,True,False,False,3.0,I don't know,4 - I sell a lot more,,,,,,,,
2,99dcf5ee-06b2-4320-82e7-7d307cc27d77,2022-03-30,2022-03-30,Fiji/ Vanuatu,20-35 years old,I am the owner/founder,True,Other,Between 1 and 2 years ago,Its only me,False,False,False,False,1.0,,0.0,0.0,True,True,True,False,False,False,True,3.0,True,True,False,True,False,3.0,True,False,True,True,True,True,False,False,True,False,False,5.0,No,2 - Sales have stayed the same,,,,,,,,
3,eec5841b-b060-416f-89f5-f2dfa96d1cf9,2022-03-30,2022-03-30,Ethiopia,20-35 years old,I am the owner/founder,True,Handcrafts,Between 3 and 5 years ago,Less than 3 employees,True,False,False,False,0.0,,1.0,1.0,True,False,True,False,False,False,False,1.0,True,True,False,False,False,2.0,True,False,False,True,False,False,False,False,True,False,False,2.0,Yes,3 - I sell a bit more,,,,,,,,
4,d3ce729d-bd66-4763-8884-a3de017eab8a,2022-03-30,2022-03-30,Ethiopia,36-50 years old,I am the owner/founder,True,Tourism and Hospitality,Between 1 and 2 years ago,Between 3 and 10 employees,False,False,False,False,1.0,,0.0,0.0,True,False,True,False,False,False,True,2.0,True,False,False,False,False,1.0,True,False,True,True,False,True,False,False,False,False,False,3.0,I don't know,4 - I sell a lot more,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1008,67361469-42f4-4f01-90ed-09505cc1a3aa,2022-04-12,2022-04-12,morocco,20-35 years old,I am the owner/founder,True,Other,Between 2 and 3 years ago,Its only me,True,True,False,False,0.0,,2.0,1.0,True,True,True,False,False,False,True,3.0,True,True,False,False,True,3.0,False,False,True,True,False,True,False,False,True,False,True,5.0,No,3 - I sell a bit more,,,,,,,,
1009,bfe5b1ef-4ca1-47dd-a0de-ee7d1e605a92,2022-04-13,2022-04-13,morocco,20-35 years old,I am the owner/founder,True,Manufacturing / Factory,Between 3 and 5 years ago,Between 3 and 10 employees,True,False,True,True,1.0,,2.0,1.0,True,True,False,True,False,False,False,2.0,True,True,False,True,False,3.0,False,False,False,False,False,True,False,False,False,False,False,1.0,No,2 - Sales have stayed the same,,,,,,,,
1010,9526a9ca-d7dc-4d1d-8a2a-efff6e6cb882,2022-04-14,2022-04-14,morocco,50-65 years old,I am the owner/founder,True,Tourism and Hospitality,Between 3 and 5 years ago,Between 3 and 10 employees,True,True,True,True,0.0,,3.0,1.0,True,True,True,True,False,False,False,3.0,True,False,False,False,False,1.0,False,False,False,True,False,True,False,True,False,False,False,3.0,No,3 - I sell a bit more,,,,,,,,
1011,b37b1d36-4f31-4fec-a9e7-344f38673997,2022-04-14,2022-04-14,morocco,20-35 years old,Other,True,Education,Between 3 and 5 years ago,Between 3 and 10 employees,,,,,,,,,True,True,True,False,False,False,True,3.0,True,False,True,False,True,3.0,True,False,False,False,True,True,False,True,False,False,False,3.0,Yes,1 - I sell a bit less,,,,,,,,


In [10]:
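df.loc[df['formal_none'] == 0, 'formal_none1'] = False
# write True into the new column as well; the tables shown below come from a run that targeted 'formal_none' here
df.loc[df['formal_none'] == 1, 'formal_none1'] = True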
df

Unnamed: 0,_uuid,start,end,country,age_group,role_in_business,female,business_sector,business_age,business_size,bank_account,business_registration,social_security,union,formal_none,formal_none1,formalization_level,formal,digital_tools,whatsapp,facebook,mobile_wallet,grab_uber_jumla,delivery_app,digital_tools_other,digitalization_level,attract_customers,order_materials,process_payments,manage_logistics,purpose_other,digitalization_purpose,covid_digitalization,stopped_using,helped_increase_sales,helped_find_new_customers,helped_find_cheaper_materials,helped_save_time,helped_track_money,helped_reduce_hassles,helped_get_customers,helped_cut_intermediaries,helped_seek_grants,digitalization_help_count,negative_effects,sales_change,expensive,dontknowhowtouse,havent_tried,bad_internet,dont_need,give_information,nondigitalization_reason_other,nondigitalization_reasons
0,a0dc3688-2144-4ffd-aea8-00e8e672cd47,2022-03-30,2022-03-30,Fiji/ Vanuatu,20-35 years old,I am the owner/founder,True,Handcrafts,Between 3 and 5 years ago,Its only me,True,False,False,False,0,False,1.0,1.0,True,False,True,False,False,False,True,2.0,True,False,False,True,True,3.0,False,False,True,True,False,True,False,False,True,False,True,5.0,No,4 - I sell a lot more,,,,,,,,
1,95cf8a5a-950b-417b-b5b6-25f286dd7b01,2022-03-30,2022-03-30,Fiji/ Vanuatu,20-35 years old,Other,True,Selling food or drinks,Less than 1 year ago,Between 3 and 10 employees,False,False,False,False,True,,0.0,0.0,True,False,True,False,False,False,True,2.0,True,False,False,False,False,1.0,True,False,True,False,False,True,False,False,True,False,False,3.0,I don't know,4 - I sell a lot more,,,,,,,,
2,99dcf5ee-06b2-4320-82e7-7d307cc27d77,2022-03-30,2022-03-30,Fiji/ Vanuatu,20-35 years old,I am the owner/founder,True,Other,Between 1 and 2 years ago,Its only me,False,False,False,False,True,,0.0,0.0,True,True,True,False,False,False,True,3.0,True,True,False,True,False,3.0,True,False,True,True,True,True,False,False,True,False,False,5.0,No,2 - Sales have stayed the same,,,,,,,,
3,eec5841b-b060-416f-89f5-f2dfa96d1cf9,2022-03-30,2022-03-30,Ethiopia,20-35 years old,I am the owner/founder,True,Handcrafts,Between 3 and 5 years ago,Less than 3 employees,True,False,False,False,0,False,1.0,1.0,True,False,True,False,False,False,False,1.0,True,True,False,False,False,2.0,True,False,False,True,False,False,False,False,True,False,False,2.0,Yes,3 - I sell a bit more,,,,,,,,
4,d3ce729d-bd66-4763-8884-a3de017eab8a,2022-03-30,2022-03-30,Ethiopia,36-50 years old,I am the owner/founder,True,Tourism and Hospitality,Between 1 and 2 years ago,Between 3 and 10 employees,False,False,False,False,True,,0.0,0.0,True,False,True,False,False,False,True,2.0,True,False,False,False,False,1.0,True,False,True,True,False,True,False,False,False,False,False,3.0,I don't know,4 - I sell a lot more,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1008,67361469-42f4-4f01-90ed-09505cc1a3aa,2022-04-12,2022-04-12,morocco,20-35 years old,I am the owner/founder,True,Other,Between 2 and 3 years ago,Its only me,True,True,False,False,0,False,2.0,1.0,True,True,True,False,False,False,True,3.0,True,True,False,False,True,3.0,False,False,True,True,False,True,False,False,True,False,True,5.0,No,3 - I sell a bit more,,,,,,,,
1009,bfe5b1ef-4ca1-47dd-a0de-ee7d1e605a92,2022-04-13,2022-04-13,morocco,20-35 years old,I am the owner/founder,True,Manufacturing / Factory,Between 3 and 5 years ago,Between 3 and 10 employees,True,False,True,True,True,,2.0,1.0,True,True,False,True,False,False,False,2.0,True,True,False,True,False,3.0,False,False,False,False,False,True,False,False,False,False,False,1.0,No,2 - Sales have stayed the same,,,,,,,,
1010,9526a9ca-d7dc-4d1d-8a2a-efff6e6cb882,2022-04-14,2022-04-14,morocco,50-65 years old,I am the owner/founder,True,Tourism and Hospitality,Between 3 and 5 years ago,Between 3 and 10 employees,True,True,True,True,0,False,3.0,1.0,True,True,True,True,False,False,False,3.0,True,False,False,False,False,1.0,False,False,False,True,False,True,False,True,False,False,False,3.0,No,3 - I sell a bit more,,,,,,,,
1011,b37b1d36-4f31-4fec-a9e7-344f38673997,2022-04-14,2022-04-14,morocco,20-35 years old,Other,True,Education,Between 3 and 5 years ago,Between 3 and 10 employees,,,,,,,,,True,True,True,False,False,False,True,3.0,True,False,True,False,True,3.0,True,False,False,False,True,True,False,True,False,False,False,3.0,Yes,1 - I sell a bit less,,,,,,,,
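
The conversion of a `0.0`/`1.0` column to Booleans can also be sketched as a single cast, without a helper column. The example uses a toy DataFrame rather than the survey data, and relies on pandas' nullable `"boolean"` dtype so that missing answers stay missing instead of being forced to `True` or `False`.

```python
import numpy as np
import pandas as pd

# Toy column mimicking formal_none: 0.0, 1.0 and a missing value.
toy = pd.DataFrame({"formal_none": [0.0, 1.0, np.nan]})

# The nullable "boolean" dtype keeps the missing value as <NA>.
toy["formal_none"] = toy["formal_none"].astype("boolean")
print(toy["formal_none"].tolist())
```

The cast raises an error if the column contains anything other than 0, 1 or missing values, which doubles as a sanity check on the encoding.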


In [11]:
df.drop(['formal_none'], axis = 1, inplace = True)
df.rename(columns = {'formal_none1': 'formal_none'}, inplace=True)
df

Unnamed: 0,_uuid,start,end,country,age_group,role_in_business,female,business_sector,business_age,business_size,bank_account,business_registration,social_security,union,formal_none,formalization_level,formal,digital_tools,whatsapp,facebook,mobile_wallet,grab_uber_jumla,delivery_app,digital_tools_other,digitalization_level,attract_customers,order_materials,process_payments,manage_logistics,purpose_other,digitalization_purpose,covid_digitalization,stopped_using,helped_increase_sales,helped_find_new_customers,helped_find_cheaper_materials,helped_save_time,helped_track_money,helped_reduce_hassles,helped_get_customers,helped_cut_intermediaries,helped_seek_grants,digitalization_help_count,negative_effects,sales_change,expensive,dontknowhowtouse,havent_tried,bad_internet,dont_need,give_information,nondigitalization_reason_other,nondigitalization_reasons
0,a0dc3688-2144-4ffd-aea8-00e8e672cd47,2022-03-30,2022-03-30,Fiji/ Vanuatu,20-35 years old,I am the owner/founder,True,Handcrafts,Between 3 and 5 years ago,Its only me,True,False,False,False,False,1.0,1.0,True,False,True,False,False,False,True,2.0,True,False,False,True,True,3.0,False,False,True,True,False,True,False,False,True,False,True,5.0,No,4 - I sell a lot more,,,,,,,,
1,95cf8a5a-950b-417b-b5b6-25f286dd7b01,2022-03-30,2022-03-30,Fiji/ Vanuatu,20-35 years old,Other,True,Selling food or drinks,Less than 1 year ago,Between 3 and 10 employees,False,False,False,False,,0.0,0.0,True,False,True,False,False,False,True,2.0,True,False,False,False,False,1.0,True,False,True,False,False,True,False,False,True,False,False,3.0,I don't know,4 - I sell a lot more,,,,,,,,
2,99dcf5ee-06b2-4320-82e7-7d307cc27d77,2022-03-30,2022-03-30,Fiji/ Vanuatu,20-35 years old,I am the owner/founder,True,Other,Between 1 and 2 years ago,Its only me,False,False,False,False,,0.0,0.0,True,True,True,False,False,False,True,3.0,True,True,False,True,False,3.0,True,False,True,True,True,True,False,False,True,False,False,5.0,No,2 - Sales have stayed the same,,,,,,,,
3,eec5841b-b060-416f-89f5-f2dfa96d1cf9,2022-03-30,2022-03-30,Ethiopia,20-35 years old,I am the owner/founder,True,Handcrafts,Between 3 and 5 years ago,Less than 3 employees,True,False,False,False,False,1.0,1.0,True,False,True,False,False,False,False,1.0,True,True,False,False,False,2.0,True,False,False,True,False,False,False,False,True,False,False,2.0,Yes,3 - I sell a bit more,,,,,,,,
4,d3ce729d-bd66-4763-8884-a3de017eab8a,2022-03-30,2022-03-30,Ethiopia,36-50 years old,I am the owner/founder,True,Tourism and Hospitality,Between 1 and 2 years ago,Between 3 and 10 employees,False,False,False,False,,0.0,0.0,True,False,True,False,False,False,True,2.0,True,False,False,False,False,1.0,True,False,True,True,False,True,False,False,False,False,False,3.0,I don't know,4 - I sell a lot more,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1008,67361469-42f4-4f01-90ed-09505cc1a3aa,2022-04-12,2022-04-12,morocco,20-35 years old,I am the owner/founder,True,Other,Between 2 and 3 years ago,Its only me,True,True,False,False,False,2.0,1.0,True,True,True,False,False,False,True,3.0,True,True,False,False,True,3.0,False,False,True,True,False,True,False,False,True,False,True,5.0,No,3 - I sell a bit more,,,,,,,,
1009,bfe5b1ef-4ca1-47dd-a0de-ee7d1e605a92,2022-04-13,2022-04-13,morocco,20-35 years old,I am the owner/founder,True,Manufacturing / Factory,Between 3 and 5 years ago,Between 3 and 10 employees,True,False,True,True,,2.0,1.0,True,True,False,True,False,False,False,2.0,True,True,False,True,False,3.0,False,False,False,False,False,True,False,False,False,False,False,1.0,No,2 - Sales have stayed the same,,,,,,,,
1010,9526a9ca-d7dc-4d1d-8a2a-efff6e6cb882,2022-04-14,2022-04-14,morocco,50-65 years old,I am the owner/founder,True,Tourism and Hospitality,Between 3 and 5 years ago,Between 3 and 10 employees,True,True,True,True,False,3.0,1.0,True,True,True,True,False,False,False,3.0,True,False,False,False,False,1.0,False,False,False,True,False,True,False,True,False,False,False,3.0,No,3 - I sell a bit more,,,,,,,,
1011,b37b1d36-4f31-4fec-a9e7-344f38673997,2022-04-14,2022-04-14,morocco,20-35 years old,Other,True,Education,Between 3 and 5 years ago,Between 3 and 10 employees,,,,,,,,True,True,True,False,False,False,True,3.0,True,False,True,False,True,3.0,True,False,False,False,True,True,False,True,False,False,False,3.0,Yes,1 - I sell a bit less,,,,,,,,


In [15]:
df.to_csv(filename, index=False)
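
Passing `index=False` matters here because pandas otherwise writes the row index as an extra first column with an empty header, and that column comes back as `Unnamed: 0` when the file is read again. The sketch below illustrates this on a toy frame; the frame and the variable names are made up for illustration and are not part of the dataset.

```python
import io

import pandas as pd

# Toy frame standing in for the real data (illustrative values only).
toy = pd.DataFrame({"country": ["Ethiopia", "morocco"], "female": [True, True]})

# Default behaviour: the row index is written as a first column with no header.
with_index = io.StringIO()
toy.to_csv(with_index)
with_index.seek(0)
cols_with_index = list(pd.read_csv(with_index).columns)

# With index=False only the actual variables are written.
without_index = io.StringIO()
toy.to_csv(without_index, index=False)
without_index.seek(0)
cols_without_index = list(pd.read_csv(without_index).columns)

print(cols_with_index)
print(cols_without_index)
```

Keeping the index out of the file means the metadata only has to describe real variables.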

In [184]:
# Move grab_uber_jumla to position 21, right after mobile_wallet.
column_to_move = df.pop('grab_uber_jumla')
df.insert(21, 'grab_uber_jumla', column_to_move)
df

Unnamed: 0,_uuid,start,end,country,age_group,role_in_business,female,business_sector,business_age,business_size,bank_account,business_registration,social_security,union,formal_none,formalization_level,formal,digital_tools,whatsapp,facebook,mobile_wallet,grab_uber_jumla,delivery_app,digital_tools_other,digitalization_level,attract_customers,order_materials,process_payments1,process_payments,manage_logistics,purpose_other,digitalization_purpose,covid_digitalization,stopped_using,helped_increase_sales,helped_find_new_customers,helped_find_cheaper_materials,helped_save_time,helped_track_money,helped_get_customers,helped_reduce_hassles,helped_cut_intermediaries,helped_seek_grants,digitalization_help_count,negative_effects,sales_change,expensive,dontknowhowtouse,havent_tried,bad_internet,dont_need,give_information,nondigitalization_reason_other,nondigitalization_reasons
0,a0dc3688-2144-4ffd-aea8-00e8e672cd47,2022-03-30,2022-03-30,Fiji/ Vanuatu,20-35 years old,I am the owner/founder,True,Handcrafts,Between 3 and 5 years ago,Its only me,True,False,False,False,0.0,1.0,1.0,True,False,True,False,False,False,True,2.0,True,False,,0.0,1.0,1.0,3.0,No,No,1.0,1.0,0.0,1.0,0.0,1.0,0.0,0.0,1.0,5.0,No,4 - I sell a lot more,,,,,,,,
1,95cf8a5a-950b-417b-b5b6-25f286dd7b01,2022-03-30,2022-03-30,Fiji/ Vanuatu,20-35 years old,Other,True,Selling food or drinks,Less than 1 year ago,Between 3 and 10 employees,False,False,False,False,1.0,0.0,0.0,True,False,True,False,False,False,True,2.0,True,False,,0.0,0.0,0.0,1.0,Yes,No,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,3.0,I don't know,4 - I sell a lot more,,,,,,,,
2,99dcf5ee-06b2-4320-82e7-7d307cc27d77,2022-03-30,2022-03-30,Fiji/ Vanuatu,20-35 years old,I am the owner/founder,True,Other,Between 1 and 2 years ago,Its only me,False,False,False,False,1.0,0.0,0.0,True,True,True,False,False,False,True,3.0,True,True,,0.0,1.0,0.0,3.0,Yes,No,1.0,1.0,1.0,1.0,0.0,1.0,0.0,0.0,0.0,5.0,No,2 - Sales have stayed the same,,,,,,,,
3,eec5841b-b060-416f-89f5-f2dfa96d1cf9,2022-03-30,2022-03-30,Ethiopia,20-35 years old,I am the owner/founder,True,Handcrafts,Between 3 and 5 years ago,Less than 3 employees,True,False,False,False,0.0,1.0,1.0,True,False,True,False,False,False,False,1.0,True,True,,0.0,0.0,0.0,2.0,Yes,No,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,2.0,Yes,3 - I sell a bit more,,,,,,,,
4,d3ce729d-bd66-4763-8884-a3de017eab8a,2022-03-30,2022-03-30,Ethiopia,36-50 years old,I am the owner/founder,True,Tourism and Hospitality,Between 1 and 2 years ago,Between 3 and 10 employees,False,False,False,False,1.0,0.0,0.0,True,False,True,False,False,False,True,2.0,True,False,,0.0,0.0,0.0,1.0,Yes,No,1.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,3.0,I don't know,4 - I sell a lot more,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1008,67361469-42f4-4f01-90ed-09505cc1a3aa,2022-04-12,2022-04-12,morocco,20-35 years old,I am the owner/founder,True,Other,Between 2 and 3 years ago,Its only me,True,True,False,False,0.0,2.0,1.0,True,True,True,False,False,False,True,3.0,True,True,,0.0,0.0,1.0,3.0,No,No,1.0,1.0,0.0,1.0,0.0,1.0,0.0,0.0,1.0,5.0,No,3 - I sell a bit more,,,,,,,,
1009,bfe5b1ef-4ca1-47dd-a0de-ee7d1e605a92,2022-04-13,2022-04-13,morocco,20-35 years old,I am the owner/founder,True,Manufacturing / Factory,Between 3 and 5 years ago,Between 3 and 10 employees,True,False,True,True,1.0,2.0,1.0,True,True,False,True,False,False,False,2.0,True,True,,0.0,1.0,0.0,3.0,No,No,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,No,2 - Sales have stayed the same,,,,,,,,
1010,9526a9ca-d7dc-4d1d-8a2a-efff6e6cb882,2022-04-14,2022-04-14,morocco,50-65 years old,I am the owner/founder,True,Tourism and Hospitality,Between 3 and 5 years ago,Between 3 and 10 employees,True,True,True,True,0.0,3.0,1.0,True,True,True,True,False,False,False,3.0,True,False,,0.0,0.0,0.0,1.0,No,No,0.0,1.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,3.0,No,3 - I sell a bit more,,,,,,,,
1011,b37b1d36-4f31-4fec-a9e7-344f38673997,2022-04-14,2022-04-14,morocco,20-35 years old,Other,True,Education,Between 3 and 5 years ago,Between 3 and 10 employees,,,,,,,,True,True,True,False,False,False,True,3.0,True,False,,1.0,0.0,1.0,3.0,Yes,No,0.0,0.0,1.0,1.0,0.0,0.0,1.0,0.0,0.0,3.0,Yes,1 - I sell a bit less,,,,,,,,


In [281]:
df['sales_change'].value_counts()

3 - I sell a bit more             320
2 - Sales have stayed the same    177
1 - I sell a bit less             143
4 - I sell a lot more              84
0 - I sell a lot less              80
Name: sales_change, dtype: int64
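
The five answer labels counted above form an ordered scale, which is worth recording in the metadata so that re-users do not have to infer it from the data. The sketch below shows one way to do that with a plain dictionary, following the Table Schema convention of an `enum` list under a field's `constraints`. It does not use the Frictionless API, and the `description` text is a hypothetical placeholder, not the survey's actual question wording.

```python
import json

# Answer labels exactly as they appear in the value_counts() output,
# ordered by their numeric prefix.
sales_change_levels = [
    "0 - I sell a lot less",
    "1 - I sell a bit less",
    "2 - Sales have stayed the same",
    "3 - I sell a bit more",
    "4 - I sell a lot more",
]

# Table Schema field descriptor written as a plain dict.
# The description is a placeholder (assumption); replace it with the real question text.
sales_change_field = {
    "name": "sales_change",
    "type": "string",
    "description": "PLACEHOLDER: wording of the survey question goes here",
    "constraints": {"enum": sales_change_levels},
}

print(json.dumps(sales_change_field, indent=2))
```

Listing the allowed values this way also lets a validator flag any label that is misspelled or missing its numeric prefix.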

In [278]:
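# describe() infers metadata for the file; despite the variable name,
# the result is a resource description, not a validation report.
report = fl.describe(filename)
print(report.to_yaml())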

path: /Users/albertocottica/Documents/UNDP Master Folder/Informality paper/formality_digitalization_renewed.csv
name: formality_digitalization_renewed
profile: tabular-data-resource
scheme: file
format: csv
hashing: md5
encoding: utf-8
schema:
  fields:
    - type: string
      name: _uuid
    - type: date
      name: start
    - type: date
      name: end
    - type: string
      name: country
    - type: string
      name: age_group
    - type: string
      name: role_in_business
    - type: boolean
      name: female
    - type: string
      name: business_sector
    - type: string
      name: business_age
    - type: string
      name: business_size
    - type: boolean
      name: bank_account
    - type: boolean
      name: business_registration
    - type: boolean
      name: social_security
    - type: boolean
      name: union
    - type: number
      name: formal_none
    - type: number
      name: formalization_level
    - type: number
      name: formal
    - type: boolean
 

In [14]:
df.drop(columns=['nondigitalization_reasons'], inplace=True)
df

Unnamed: 0,_uuid,start,end,country,age_group,role_in_business,female,business_sector,business_age,business_size,bank_account,business_registration,social_security,union,formal_none,formalization_level,formal,digital_tools,whatsapp,facebook,mobile_wallet,grab_uber_jumla,delivery_app,digital_tools_other,digitalization_level,attract_customers,order_materials,process_payments,manage_logistics,purpose_other,digitalization_purpose,covid_digitalization,stopped_using,helped_increase_sales,helped_find_new_customers,helped_find_cheaper_materials,helped_save_time,helped_track_money,helped_reduce_hassles,helped_get_customers,helped_cut_intermediaries,helped_seek_grants,digitalization_help_count,negative_effects,sales_change,expensive,dontknowhowtouse,havent_tried,bad_internet,dont_need,give_information,nondigitalization_reason_other
0,a0dc3688-2144-4ffd-aea8-00e8e672cd47,2022-03-30,2022-03-30,Fiji/ Vanuatu,20-35 years old,I am the owner/founder,True,Handcrafts,Between 3 and 5 years ago,Its only me,True,False,False,False,False,1.0,1.0,True,False,True,False,False,False,True,2.0,True,False,False,True,True,3.0,False,False,True,True,False,True,False,False,True,False,True,5.0,No,4 - I sell a lot more,,,,,,,
1,95cf8a5a-950b-417b-b5b6-25f286dd7b01,2022-03-30,2022-03-30,Fiji/ Vanuatu,20-35 years old,Other,True,Selling food or drinks,Less than 1 year ago,Between 3 and 10 employees,False,False,False,False,,0.0,0.0,True,False,True,False,False,False,True,2.0,True,False,False,False,False,1.0,True,False,True,False,False,True,False,False,True,False,False,3.0,I don't know,4 - I sell a lot more,,,,,,,
2,99dcf5ee-06b2-4320-82e7-7d307cc27d77,2022-03-30,2022-03-30,Fiji/ Vanuatu,20-35 years old,I am the owner/founder,True,Other,Between 1 and 2 years ago,Its only me,False,False,False,False,,0.0,0.0,True,True,True,False,False,False,True,3.0,True,True,False,True,False,3.0,True,False,True,True,True,True,False,False,True,False,False,5.0,No,2 - Sales have stayed the same,,,,,,,
3,eec5841b-b060-416f-89f5-f2dfa96d1cf9,2022-03-30,2022-03-30,Ethiopia,20-35 years old,I am the owner/founder,True,Handcrafts,Between 3 and 5 years ago,Less than 3 employees,True,False,False,False,False,1.0,1.0,True,False,True,False,False,False,False,1.0,True,True,False,False,False,2.0,True,False,False,True,False,False,False,False,True,False,False,2.0,Yes,3 - I sell a bit more,,,,,,,
4,d3ce729d-bd66-4763-8884-a3de017eab8a,2022-03-30,2022-03-30,Ethiopia,36-50 years old,I am the owner/founder,True,Tourism and Hospitality,Between 1 and 2 years ago,Between 3 and 10 employees,False,False,False,False,,0.0,0.0,True,False,True,False,False,False,True,2.0,True,False,False,False,False,1.0,True,False,True,True,False,True,False,False,False,False,False,3.0,I don't know,4 - I sell a lot more,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1008,67361469-42f4-4f01-90ed-09505cc1a3aa,2022-04-12,2022-04-12,morocco,20-35 years old,I am the owner/founder,True,Other,Between 2 and 3 years ago,Its only me,True,True,False,False,False,2.0,1.0,True,True,True,False,False,False,True,3.0,True,True,False,False,True,3.0,False,False,True,True,False,True,False,False,True,False,True,5.0,No,3 - I sell a bit more,,,,,,,
1009,bfe5b1ef-4ca1-47dd-a0de-ee7d1e605a92,2022-04-13,2022-04-13,morocco,20-35 years old,I am the owner/founder,True,Manufacturing / Factory,Between 3 and 5 years ago,Between 3 and 10 employees,True,False,True,True,,2.0,1.0,True,True,False,True,False,False,False,2.0,True,True,False,True,False,3.0,False,False,False,False,False,True,False,False,False,False,False,1.0,No,2 - Sales have stayed the same,,,,,,,
1010,9526a9ca-d7dc-4d1d-8a2a-efff6e6cb882,2022-04-14,2022-04-14,morocco,50-65 years old,I am the owner/founder,True,Tourism and Hospitality,Between 3 and 5 years ago,Between 3 and 10 employees,True,True,True,True,False,3.0,1.0,True,True,True,True,False,False,False,3.0,True,False,False,False,False,1.0,False,False,False,True,False,True,False,True,False,False,False,3.0,No,3 - I sell a bit more,,,,,,,
1011,b37b1d36-4f31-4fec-a9e7-344f38673997,2022-04-14,2022-04-14,morocco,20-35 years old,Other,True,Education,Between 3 and 5 years ago,Between 3 and 10 employees,,,,,,,,True,True,True,False,False,False,True,3.0,True,False,True,False,True,3.0,True,False,False,False,True,True,False,True,False,False,False,3.0,Yes,1 - I sell a bit less,,,,,,,
