## CHAPTER 3 PROCESSING, WRANGLING AND DATA VISUALIZATION.
### DATA WRANGLING 
#### Showing the fundamental steps for manipulating data and identifying the most commonly used data wrangling techniques
##### *Jose Ruben Garcia Garcia*
##### *February 2024*
##### *Reference: Practical Machine Learning with Python: A Problem-Solver's Guide*

### Creating a dataframe

In [1]:
#The first thing we are going to do is to create a dataframe using some libraries

import random
import datetime 
import numpy as np
import pandas as pd
from random import randrange
from sklearn import preprocessing


In [2]:
#This function generates a random date based on params
def _random_date(start,date_count):
    current = start
    while date_count > 0:
        curr = current + datetime.timedelta(days=randrange(42))
        yield curr
        date_count-=1
        

In [3]:
import datetime
import numpy as np
import pandas as pd
import random

#This function generates a random transaction dataset
def generate_sample_data(row_count=1000):
    # sentinels
    startDate = datetime.datetime(2016, 1, 1,13)
    serial_number_sentinel = 1000
    user_id_sentinel = 5001
    product_id_sentinel = 101
    price_sentinel = 2000
    
    
    # base list of attributes
    data_dict = {
    'Serial No': np.arange(row_count)+serial_number_sentinel,
    'Date': np.random.permutation(pd.to_datetime([x.strftime("%m-%d-%Y")  # date format changed here
                                                    for x in _random_date(startDate,
                                                                          row_count)]).date
                                  ),
    'User ID': np.random.permutation(np.random.randint(0,
                                                       row_count,
                                                       size=int(row_count/10)) + user_id_sentinel).tolist()*10,
    'Product ID': np.random.permutation(np.random.randint(0,
                                                          row_count,
                                                          size=int(row_count/10))+ product_id_sentinel).tolist()*10 ,
    'Quantity Purchased': np.random.permutation(np.random.randint(1,
                                                                  42,
                                                                  size=row_count)),
    'Price': np.round(np.abs(np.random.randn(row_count)+1)*price_sentinel,
                      decimals=2),
    'User Type':np.random.permutation([chr(random.randrange(97, 97 + 3 + 1)) 
                                            for i in range(row_count)])
    }

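    # introduce missing values and placeholder IDs
    # (note: 'User Type' is a fixed-width string array, so np.nan is stored there as the character 'n')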
    for index in range(int(np.sqrt(row_count))): 
        data_dict['Price'][np.argmax(data_dict['Price'] == random.choice(data_dict['Price']))] = np.nan
        data_dict['User Type'][np.argmax(data_dict['User Type'] == random.choice(data_dict['User Type']))] = np.nan
        data_dict['Date'][np.argmax(data_dict['Date'] == random.choice(data_dict['Date']))] = np.nan
        data_dict['Product ID'][np.argmax(data_dict['Product ID'] == random.choice(data_dict['Product ID']))] = 0
        data_dict['Serial No'][np.argmax(data_dict['Serial No'] == random.choice(data_dict['Serial No']))] = -1
        data_dict['User ID'][np.argmax(data_dict['User ID'] == random.choice(data_dict['User ID']))] = -101
        
    
    # create data frame
    df = pd.DataFrame(data_dict)
    
    return df

# Call the function to generate the DataFrame
df = generate_sample_data()

# Display the DataFrame
df.head()

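|   | Serial No | Date | User ID | Product ID | Quantity Purchased | Price | User Type |
|---|---|---|---|---|---|---|---|
| 0 | 1000 | NaN | -101 | 0 | 6 | NaN | n |
| 1 | 1001 | NaN | 5183 | 200 | 11 | 4489.9 | n |
| 2 | 1002 | NaN | 5065 | 229 | 20 | 2456.06 | n |
| 3 | 1003 | NaN | 5909 | 439 | 32 | 397.36 | n |
| 4 | 1004 | 2016-02-04 | 5560 | 570 | 3 | 1721.89 | n |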


### Understanding data

The dataset created above describes transactions with the following attributes:

- Date: The date of the transaction
- Price: The price of the product purchased
- Product ID: Product identification number
- Quantity Purchased: The quantity of product purchased in this transaction
- Serial No: The transaction serial number
- User ID: Identification number of the user performing the transaction
- User Type: The type of user

### Techniques

In [4]:
#Show the number of rows
print("Number of rows::", df.shape[0])

Number of rows:: 1000


In [5]:
#Show the number of columns
print("Number of columns::", df.shape[1])

Number of columns:: 7


In [6]:
#show the names of every column in the dataset
print("Column names::", df.columns.values.tolist())

Column names:: ['Serial No', 'Date', 'User ID', 'Product ID', 'Quantity Purchased', 'Price', 'User Type']


In [7]:
#Show the data type of every column
print("Column data types::\n", df.dtypes )

Column data types::
 Serial No               int64
Date                   object
User ID                 int64
Product ID              int64
Quantity Purchased      int64
Price                 float64
User Type              object
dtype: object


In [8]:
#Look for the columns with missing values in their rows
print("Columns with missing values::",df.columns[df.isnull().any()].tolist())

Columns with missing values:: ['Date', 'Price']
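
Note that `User Type` does not appear in this list even though `generate_sample_data` writes `np.nan` into it. A likely explanation, sketched below under the assumption that the column is built as a fixed-width NumPy string array (which is what a list of one-character strings becomes), is that NumPy stores the float `nan` as text and truncates it to `'n'`. The variable names are illustrative only and are not part of the notebook.

```python
import numpy as np
import pandas as pd

# A list of one-character strings becomes a fixed-width '<U1' array.
user_types = np.array(['a', 'b', 'c', 'd'])
user_types[0] = np.nan          # stored as the text 'nan', truncated to 'n'
fixed_width_missing = int(pd.Series(user_types).isnull().sum())

# An object array keeps the real float NaN, so pandas can detect it.
user_types_obj = np.array(['a', 'b', 'c', 'd'], dtype=object)
user_types_obj[0] = np.nan
object_missing = int(pd.Series(user_types_obj).isnull().sum())

print(user_types[0], fixed_width_missing, object_missing)
```

This would also account for the `n` values in the `User Type` column of the first rows shown earlier.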


In [9]:
#Count the number of rows with missing values
print("Rows with missing values (count):", len(pd.isnull(df).any(axis=1).to_numpy().nonzero()[0]))

Rows with missing values (count): 58


In [10]:
#Showing the sample indices with missing data
print("Sample indices with missing data:", pd.isnull(df).any(axis=1).to_numpy().nonzero()[0].tolist()[:5])

Sample indices with missing data: [0, 1, 2, 3, 5]


In [11]:
# Getting general info of the dataset
print("General information")
print(df.info())

General information
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 7 columns):
 #   Column              Non-Null Count  Dtype  
---  ------              --------------  -----  
 0   Serial No           1000 non-null   int64  
 1   Date                969 non-null    object 
 2   User ID             1000 non-null   int64  
 3   Product ID          1000 non-null   int64  
 4   Quantity Purchased  1000 non-null   int64  
 5   Price               971 non-null    float64
 6   User Type           1000 non-null   object 
dtypes: float64(1), int64(4), object(2)
memory usage: 54.8+ KB
None


In [12]:
# Getting general stats of the dataset
print("General statistics")
print(df.describe())

General statistics
         Serial No      User ID  Product ID  Quantity Purchased        Price
count  1000.000000  1000.000000  1000.00000          1000.00000   971.000000
mean   1455.285000  5468.379000   636.44400            21.29900  2388.244367
std     386.397338   327.245111   282.85991            11.67081  1644.111601
min      -1.000000  -101.000000     0.00000             1.00000    10.690000
25%    1227.750000  5267.500000   415.25000            11.75000  1081.240000
50%    1484.500000  5453.000000   632.00000            21.00000  2107.010000
75%    1747.250000  5664.250000   892.75000            32.00000  3381.930000
max    1999.000000  5984.000000  1096.00000            41.00000  7665.100000
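
The minimum values of -1, -101 and 0 for `Serial No`, `User ID` and `Product ID` are the placeholders injected by `generate_sample_data`, not real identifiers. As a minimal sketch, assuming those three placeholders should be treated as missing data, they could be masked out before computing statistics. The small `sample` frame and the `placeholders` mapping are illustrative stand-ins.

```python
import numpy as np
import pandas as pd

sample = pd.DataFrame({
    'Serial No': [1000, -1, 1002],
    'User ID': [-101, 5183, 5065],
    'Product ID': [0, 200, 229],
})

# Placeholder written into each column by generate_sample_data.
placeholders = {'Serial No': -1, 'User ID': -101, 'Product ID': 0}

cleaned = sample.copy()
for column, placeholder in placeholders.items():
    # mask() replaces the entries where the condition is True with NaN.
    cleaned[column] = cleaned[column].mask(cleaned[column] == placeholder)

print(cleaned.isnull().sum())
```

After masking, `describe()` and `isnull()` would report these rows as missing instead of folding the placeholders into the minimum and the mean.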


### Filtering Data

In [13]:
df.head()

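|   | Serial No | Date | User ID | Product ID | Quantity Purchased | Price | User Type |
|---|---|---|---|---|---|---|---|
| 0 | 1000 | NaN | -101 | 0 | 6 | NaN | n |
| 1 | 1001 | NaN | 5183 | 200 | 11 | 4489.9 | n |
| 2 | 1002 | NaN | 5065 | 229 | 20 | 2456.06 | n |
| 3 | 1003 | NaN | 5909 | 439 | 32 | 397.36 | n |
| 4 | 1004 | 2016-02-04 | 5560 | 570 | 3 | 1721.89 | n |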


In [14]:
def cleanup_column_names(df, rename_dict=None, do_inplace=True):
    """Rename columns; by default lowercase them and replace spaces with underscores."""
    if not rename_dict:
        rename_dict = {col: col.lower().replace(' ', '_') for col in df.columns}
    if do_inplace:
        df.rename(columns=rename_dict, inplace=True)
        return df
    # without inplace, rename returns a new DataFrame that must be returned to the caller
    return df.rename(columns=rename_dict)

# Check that df is a valid, non-empty DataFrame
if df is not None and not df.empty:
    # Call cleanup_column_names on the DataFrame df
    df = cleanup_column_names(df)
    print("Column names of DataFrame df were normalized successfully.")
else:
    print("DataFrame df is empty or was not initialized correctly.")

Column names of DataFrame df were normalized successfully.


In [15]:
df.head()

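|   | serial_no | date | user_id | product_id | quantity_purchased | price | user_type |
|---|---|---|---|---|---|---|---|
| 0 | 1000 | NaN | -101 | 0 | 6 | NaN | n |
| 1 | 1001 | NaN | 5183 | 200 | 11 | 4489.9 | n |
| 2 | 1002 | NaN | 5065 | 229 | 20 | 2456.06 | n |
| 3 | 1003 | NaN | 5909 | 439 | 32 | 397.36 | n |
| 4 | 1004 | 2016-02-04 | 5560 | 570 | 3 | 1721.89 | n |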


In [23]:
#Creating a subset of the dataset using the name of the column
print("Using column name::")
print(df.quantity_purchased.values)

Using column name::
[ 6 11 20 32  3 11 27 28 17 15  7 41 24 30 20 16 27  5 35 12 12 35 20 39
 25  5 32 23 34  3 30 34 40 20 33 35  2 16  5 32 20 37 34 36 24  6 28 31
 38  8  6 13 33 23 11 24  3 13 18 38  2 40 33 14 27 29 16  5  6 25 30 32
 11  1  7 26 36 21  2 20 38 28 18 40 37 38 27 21 40  1 25 32  4 33 24 10
 30  6 16 22 34 12  4 11 19 35 40 28 39 33 38  3 39  5 36  1 17 34 14 22
 13 13 13 39 28 32  6 22 16  7 35  5 17 10 23 13 24 40 13 35  4 11 39 15
 21 22 12  2  6 18 39 16 24 34 23  7 41 17 18 28 35 12 17 20 40 20 20  9
 36 35 18 13 24 28 12 21 15 16 37 16 16 36 41 31 22 27  9 18 21 23 12  1
 24  5 24 32 11  7 32  4 30  3 25  6 37  5  9  9 32  9 20 29 38 26 40 19
 10 25 33 30 30 21 26 11 17 11 18  7  8  1 25 10  3 36 25 16 30 34 25 23
 36 31  6  3 19 11 22 20 18 37 22 37  3 29 12 21 12 15 27 31 12  9 12 30
 16 26 41 10 11 26 17 32 34 27  2  6  6  2 23  2 40  5 21 40 28 25 17  1
 32  4 39 14 29 17 17 34 22 37 20 20 34 11  5  6 34 25 11 40 34 16 39 24
 28 10 22 38 18  1  7 38  8 33 

In [24]:
#selecting specific rows of the dataframe by position
print("Select specific row indices::")
print(df.iloc[[10,501,20]])

Select specific row indices::
     serial_no        date  user_id  product_id  quantity_purchased    price  \
10        1010         NaN     5467         947                   7  2770.73   
501       1501  2016-01-24     5183         200                  41  1991.38   
20        1020         NaN     5254         124                  12  3406.04   

    user_type  
10          n  
501         d  
20          n  


In [25]:
print("excluding specific row indices::")
print(df.drop([0,24,51], axis = 0).head())

excluding specific row indices::
   serial_no        date  user_id  product_id  quantity_purchased    price  \
1       1001         NaN     5183         200                  11  4489.90   
2       1002         NaN     5065         229                  20  2456.06   
3       1003         NaN     5909         439                  32   397.36   
4       1004  2016-02-04     5560         570                   3  1721.89   
5       1005         NaN     5104         556                  11  2243.98   

  user_type  
1         n  
2         n  
3         n  
4         n  
5         n  


In [26]:
#creating a subset based on logical conditions.
print("Creating a subset based on logical condition(s)::")
print(df[df.quantity_purchased>25].head())

Creating a subset based on logical condition(s)::
    serial_no        date  user_id  product_id  quantity_purchased    price  \
3        1003         NaN     5909         439                  32   397.36   
6        1006         NaN     5498         452                  27  2434.05   
7        1007  2016-01-29     5258         677                  28  5303.69   
11       1011         NaN     5338         848                  41  1307.42   
13       1013  2016-01-20     5838         957                  30  3882.12   

   user_type  
3          n  
6          n  
7          n  
11         n  
13         n  
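
Logical conditions can also be combined. The sketch below uses a small stand-in frame whose column names mirror the cleaned-up dataset; the `price > 1000` threshold is an arbitrary value chosen for illustration.

```python
import pandas as pd

# Small stand-in frame; the column names mirror the cleaned-up dataset.
sample = pd.DataFrame({
    'quantity_purchased': [3, 27, 32, 41, 11],
    'price': [1721.89, 2434.05, 397.36, 1307.42, 2243.98],
})

# Wrap each condition in parentheses and combine with & (and) or | (or).
large_and_pricey = sample[(sample['quantity_purchased'] > 25) & (sample['price'] > 1000)]

# The same filter written with DataFrame.query.
same_rows = sample.query('quantity_purchased > 25 and price > 1000')

print(large_and_pricey)
```

The parentheses matter because `&` binds more tightly than the comparison operators in Python.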


In [28]:
# df[100:] and df.tail(-100) both skip the first 100 rows and return the rest
print("Subsetting based on offset from top (bottom)")
print(df[100:].head(), df.tail(-100) )

Subsetting based on offset from top (bottom)
     serial_no        date  user_id  product_id  quantity_purchased    price  \
100       1100  2016-01-14     5200         256                  34  2614.96   
101       1101  2016-02-01     5183         200                  12  1653.19   
102       1102  2016-01-13     5065         229                   4  2343.58   
103       1103  2016-01-17     5909         439                  11   966.36   
104       1104  2016-02-08     5560         570                  19  3964.39   

    user_type  
100         a  
101         a  
102         d  
103         c  
104         c        serial_no        date  user_id  product_id  quantity_purchased    price  \
100       1100  2016-01-14     5200         256                  34  2614.96   
101       1101  2016-02-01     5183         200                  12  1653.19   
102       1102  2016-01-13     5065         229                   4  2343.58   
103       1103  2016-01-17     5909         439           

### Typecasting

In [29]:
#converting the date column from object (string) dtype to datetime64
df['date'] = pd.to_datetime(df.date)
print(df.dtypes)

serial_no                      int64
date                  datetime64[ns]
user_id                        int64
product_id                     int64
quantity_purchased             int64
price                        float64
user_type                     object
dtype: object
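
When a column may contain values that are not valid dates, `pd.to_datetime` accepts `errors='coerce'` to turn them into `NaT` instead of raising. The following is a minimal sketch on made-up input; the `raw_dates` values and the explicit `format` string are assumptions for illustration.

```python
import pandas as pd

raw_dates = pd.Series(['2016-02-04', None, 'not a date', '2016-01-29'])

# errors='coerce' turns unparseable entries into NaT instead of raising.
parsed = pd.to_datetime(raw_dates, format='%Y-%m-%d', errors='coerce')

print(parsed.dtype)
print(parsed.isna().tolist())
```

Passing an explicit `format` also makes the parsing rule visible to the reader rather than leaving it to inference.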


### Transformations

In [30]:
#transform existing columns or derive new attributes based on requirements
def expand_user_type(u_type):
    """Maps user types to user classes."""
    if u_type in ['a','b']:
        return 'new'
    elif u_type == 'c':
        return 'existing'
    elif u_type == 'd':
        return 'loyal_existing'
    else:
        return 'error'

df['user_class'] = df['user_type'].map(expand_user_type)

In [34]:
#Adding a new column with the ISO week number extracted from the date column (0 where the date is missing)
df['purchase_week'] = df['date'].dt.isocalendar().week.fillna(0).astype(int)
df.head()


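|   | serial_no | date | user_id | product_id | quantity_purchased | price | user_type | user_class | purchase_week |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1000 | NaT | -101 | 0 | 6 | NaN | n | error | 0 |
| 1 | 1001 | NaT | 5183 | 200 | 11 | 4489.9 | n | error | 0 |
| 2 | 1002 | NaT | 5065 | 229 | 20 | 2456.06 | n | error | 0 |
| 3 | 1003 | NaT | 5909 | 439 | 32 | 397.36 | n | error | 0 |
| 4 | 1004 | 2016-02-04 | 5560 | 570 | 3 | 1721.89 | n | error | 5 |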


In [35]:
#getting the range (max minus min) of every numeric attribute
df.select_dtypes(include=[np.number]).apply(lambda x: x.max() - x.min())

serial_no             2000.00
user_id               6085.00
product_id            1096.00
quantity_purchased      40.00
price                 7654.41
purchase_week           53.00
dtype: float64

In [36]:
#handling missing values: first drop the rows with missing dates
print("drop Rows with missing dates::")
df_dropped = df.dropna(subset=['date'])
print("Shape::",df_dropped.shape)

drop Rows with missing dates::
Shape:: (969, 9)


In [37]:
#Fill missing price with the mean of the price column
print("Fill missing price values with mean price::")
# assign() returns a new frame, which avoids filling a column selection in place
df_dropped = df_dropped.assign(price=df_dropped['price'].fillna(np.round(df.price.mean(), decimals=2)))
df_dropped

Fill missing price values with mean price::


|   | serial_no | date | user_id | product_id | quantity_purchased | price | user_type | user_class | purchase_week |
|---|---|---|---|---|---|---|---|---|---|
| 4 | 1004 | 2016-02-04 | 5560 | 570 | 3 | 1721.89 | n | error | 5 |
| 7 | 1007 | 2016-01-29 | 5258 | 677 | 28 | 5303.69 | n | error | 4 |
| 9 | 1009 | 2016-01-10 | 5631 | 861 | 15 | 384.21 | n | error | 1 |
| 12 | 1012 | 2016-01-20 | 5076 | 370 | 24 | 2627.57 | n | error | 3 |
| 13 | 1013 | 2016-01-20 | 5838 | 957 | 30 | 3882.12 | n | error | 3 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 995 | 1995 | 2016-02-07 | 5773 | 710 | 27 | 2020.02 | b | new | 5 |
| 996 | 1996 | 2016-01-15 | 5545 | 627 | 3 | 397.91 | b | new | 2 |
| 997 | 1997 | 2016-01-26 | 5345 | 451 | 38 | 4588.08 | c | existing | 4 |
| 998 | 1998 | 2016-01-17 | 5837 | 922 | 11 | 2203.14 | b | new | 2 |


In [41]:
#Fill missing user_type in other ways. 
print("Fill Missing user_type values with value from previous row (forward fill) ::")
df_dropped = df_dropped.assign(user_type=df_dropped['user_type'].ffill())

Fill Missing user_type values with value from previous row (forward fill) ::


In [42]:
# Fill Missing user_type values with value from next row (backward fill)
print("Fill Missing user_type values with value from next row (backward fill) ::")
df_dropped = df_dropped.assign(user_type=df_dropped['user_type'].bfill())

Fill Missing user_type values with value from next row (backward fill) ::

### Duplicates

In [45]:
#Identifying and deleting rows with duplicate serial numbers
df_dropped[df_dropped.duplicated(subset=['serial_no'])]  # rows whose serial_no repeats an earlier row
df_dropped = df_dropped.drop_duplicates(subset=['serial_no'])


In [49]:
df.head()

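|   | serial_no | date | user_id | product_id | quantity_purchased | price | user_type | user_class | purchase_week |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1000 | NaT | -101 | 0 | 6 | NaN | n | error | 0 |
| 1 | 1001 | NaT | 5183 | 200 | 11 | 4489.9 | n | error | 0 |
| 2 | 1002 | NaT | 5065 | 229 | 20 | 2456.06 | n | error | 0 |
| 3 | 1003 | NaT | 5909 | 439 | 32 | 397.36 | n | error | 0 |
| 4 | 1004 | 2016-02-04 | 5560 | 570 | 3 | 1721.89 | n | error | 5 |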


In [46]:
df_dropped.head()

|   | serial_no | date | user_id | product_id | quantity_purchased | price | user_type | user_class | purchase_week |
|---|---|---|---|---|---|---|---|---|---|
| 4 | 1004 | 2016-02-04 | 5560 | 570 | 3 | 1721.89 | n | error | 5 |
| 7 | 1007 | 2016-01-29 | 5258 | 677 | 28 | 5303.69 | n | error | 4 |
| 9 | 1009 | 2016-01-10 | 5631 | 861 | 15 | 384.21 | n | error | 1 |
| 12 | 1012 | 2016-01-20 | 5076 | 370 | 24 | 2627.57 | n | error | 3 |
| 13 | 1013 | 2016-01-20 | 5838 | 957 | 30 | 3882.12 | n | error | 3 |


In [54]:
#categorical data and how to encode it.
print("*"*30)
print("Encoding Categorical Variables")
print("*"*30)
print(pd.get_dummies(df,columns=['user_type']).head())
print("\n")

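# np.nan replaces np.NAN, an alias removed in NumPy 2.0; types missing from the map (such as 'n') become NaN
type_map={'a':0,'b':1,'c':2,'d':3,np.nan:-1}
df['encoded_user_type'] = df.user_type.map(type_map)
print(df.head())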
print("\n")

******************************
Encoding Categorical Variables
******************************
   serial_no       date  user_id  product_id  quantity_purchased    price  \
0       1000        NaT     -101           0                   6      NaN   
1       1001        NaT     5183         200                  11  4489.90   
2       1002        NaT     5065         229                  20  2456.06   
3       1003        NaT     5909         439                  32   397.36   
4       1004 2016-02-04     5560         570                   3  1721.89   

  user_class  purchase_week  encoded_user_type  user_type_a  user_type_b  \
0      error              0                NaN        False        False   
1      error              0                NaN        False        False   
2      error              0                NaN        False        False   
3      error              0                NaN        False        False   
4      error              5                NaN        False     

In [53]:
df.tail()

|   | serial_no | date | user_id | product_id | quantity_purchased | price | user_type | user_class | purchase_week | encoded_user_type |
|---|---|---|---|---|---|---|---|---|---|---|
| 995 | 1995 | 2016-02-07 | 5773 | 710 | 27 | 2020.02 | b | new | 5 | 1.0 |
| 996 | 1996 | 2016-01-15 | 5545 | 627 | 3 | 397.91 | b | new | 2 | 1.0 |
| 997 | 1997 | 2016-01-26 | 5345 | 451 | 38 | 4588.08 | c | existing | 4 | 2.0 |
| 998 | 1998 | 2016-01-17 | 5837 | 922 | 11 | 2203.14 | b | new | 2 | 1.0 |
| 999 | 1999 | 2016-02-06 | 5597 | 739 | 12 | 3366.29 | d | loyal_existing | 5 | 3.0 |
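
Mapping with a dictionary leaves every value that is not a key (for example the `n` entries) as `NaN`, which is why `encoded_user_type` is stored as a float. As an alternative sketch, not the notebook's method, `pd.Categorical` with a fixed category list yields integer codes and marks missing or unknown values with -1. The sample series is made up.

```python
import numpy as np
import pandas as pd

user_type = pd.Series(['a', 'b', np.nan, 'd', 'c'])

# Fix the category order so the codes are stable: a=0, b=1, c=2, d=3.
categories = ['a', 'b', 'c', 'd']
codes = pd.Categorical(user_type, categories=categories).codes

print(codes.tolist())
```

Because the category list is given explicitly, the codes do not depend on which values happen to appear in the data.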


### Normalizing values

In [57]:
#Normalizing the price column to the [0, 1] range with scikit-learn's MinMaxScaler
print("*"*30)
print("Normalizing Numeric Data")
print("*"*30)

print("Min-Max Scaler::")
df_normalized = df.dropna().copy()
min_max_scaler = preprocessing.MinMaxScaler()
# fit_transform expects a 2-D input, so select the column as a one-column DataFrame
np_scaled = min_max_scaler.fit_transform(df_normalized[['price']])
df_normalized['price'] = np_scaled.ravel()
print(df_normalized.head())
print("\n")

******************************
Normalizing Numeric Data
******************************
Min-Max Scaler::
    serial_no       date  user_id  product_id  quantity_purchased     price  \
16       1016 2016-01-19     5301         478                  27  0.125191   
31       1031 2016-01-28     5900         402                  34  0.289612   
35       1035 2016-01-09     5308         328                  35  0.291780   
36       1036 2016-01-03     5408         627                   2  0.763459   
37         -1 2016-02-10     5097         588                  16  0.168622   

   user_type      user_class  purchase_week  encoded_user_type  
16         d  loyal_existing              3                3.0  
31         d  loyal_existing              4                3.0  
35         b             new              1                1.0  
36         d  loyal_existing             53                3.0  
37         d  loyal_existing              6                3.0  
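
With its default feature range of (0, 1), `MinMaxScaler` rescales each value as (x - min) / (max - min). A minimal sketch of the same arithmetic in plain pandas, using a few prices taken from the first rows of the dataset:

```python
import pandas as pd

price = pd.Series([397.36, 1721.89, 2456.06, 4489.90])

# Min-max scaling: (x - min) / (max - min), mapping the column onto [0, 1].
price_scaled = (price - price.min()) / (price.max() - price.min())

print(price_scaled.round(4).tolist())
```

The smallest price maps to 0 and the largest to 1, and the ordering of the values is preserved.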




### Data summarization

In [58]:
print("Aggregates based on condition::")
print(df['price'][df['user_type']=='a'].mean())
print("\n")

Aggregates based on condition::
2200.4495454545454




In [59]:
print("Row Counts on condition::")
print(df['purchase_week'].value_counts())
print("\n")

Row Counts on condition::
purchase_week
5     192
3     161
1     156
2     154
4     140
6      94
53     72
0      31
Name: count, dtype: int64
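
Week 53 may look surprising for a dataset that starts in January 2016, and week 0 is only the placeholder used for missing dates. Week numbers follow the ISO 8601 calendar, in which the first days of January 2016 still belong to the last week of 2015. A short check with the standard library:

```python
import datetime

# 1 January 2016 was a Friday, so ISO week 1 of 2016 starts on Monday 4 January.
for day in (datetime.date(2016, 1, 3), datetime.date(2016, 1, 4)):
    iso_year, iso_week, iso_weekday = day.isocalendar()
    print(day, iso_year, iso_week)
```

Grouping by week number alone therefore mixes the first three days of January with a week that nominally belongs to the previous year.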




In [60]:
print("GroupBy attributes::")
print(df.groupby(['user_class'])['quantity_purchased'].sum())
print("\n")

GroupBy attributes::
user_class
error               636
existing           5086
loyal_existing     5089
new               10488
Name: quantity_purchased, dtype: int64




In [61]:
print("GroupBy with different aggregates::")
print(df.groupby(['user_class'])['quantity_purchased'].agg(['sum',
                                                            'mean',
                                                            np.count_nonzero]))
print("\n")  

GroupBy with different aggregates::
                  sum       mean  count_nonzero
user_class                                     
error             636  21.200000             30
existing         5086  20.930041            243
loyal_existing   5089  21.935345            232
new             10488  21.187879            495






In [63]:
print("GroupBy with specific agg for each attribute::")
print(df.groupby(['user_class','user_type']).agg({'price':'mean',
                                                    'quantity_purchased':'max'}))
print("\n")

GroupBy with specific agg for each attribute::
                                price  quantity_purchased
user_class     user_type                                 
error          n          2423.013448                  41
existing       c          2526.547149                  41
loyal_existing d          2479.323921                  41
new            a          2200.449545                  41
               b          2338.745692                  41






In [66]:
print("Pivot tables::")
print(df.pivot_table(index='date', columns='user_type', 
                     values='price',aggfunc='mean'))
print("\n")      

print("Stacking::")
print(df.stack())
print("\n")   

Pivot tables::
user_type             a            b            c            d            n
date                                                                       
2016-01-01  1486.523333  2632.640000  1593.625000  3457.932500          NaN
2016-01-02  2588.358000  1613.865714  2999.466364  3661.312500          NaN
2016-01-03  2294.368571  1854.276364  3360.590000  3465.101429   837.540000
2016-01-04  2255.388571  2652.676250  2684.996667  1534.075714          NaN
2016-01-05  1966.822857  1092.540000  3303.923333  1514.975000  2103.840000
2016-01-06  1817.531429  2989.940000  3117.361667  2204.515714          NaN
2016-01-07  2479.020000  1897.362222  1901.205000  2541.292000          NaN
2016-01-08  2895.760000  2309.982222  2258.340000  2117.576000  2457.016667
2016-01-09  2599.073333  2527.785000  3301.696667  2794.904000  1360.070000
2016-01-10  1902.797500  3150.995000  3677.375000  2954.040000   384.210000
2016-01-11  3074.250000  1828.173333  2902.305455  3300.478000  3282.4000

