# California PO Data Analysis for 2012-2015 Spend

In [1]:
# import the modules I'll be leveraging in the analysis
import pandas as pd
import numpy as np
import datetime as dt
import math

In [2]:
# reads csv file of PO data and creates dataframe object, "po_df"
po_df = pd.read_csv('PURCHASE ORDER DATA EXTRACT 2012-2015_0.csv')

Now that the csv file has been imported, let's take a look at the first ten rows.

In [3]:
po_df.head(10)

Unnamed: 0,Creation Date,Purchase Date,Fiscal Year,LPA Number,Purchase Order Number,Requisition Number,Acquisition Type,Sub-Acquisition Type,Acquisition Method,Sub-Acquisition Method,...,Classification Codes,Normalized UNSPSC,Commodity Title,Class,Class Title,Family,Family Title,Segment,Segment Title,Location
0,08/27/2013,,2013-2014,7-12-70-26,REQ0011118,REQ0011118,IT Goods,,WSCA/Coop,,...,,,,,,,,,,
1,01/29/2014,,2013-2014,,REQ0011932,REQ0011932,NON-IT Goods,,Informal Competitive,,...,76121504.0,76121504.0,,,,,,,,
2,11/01/2013,,2013-2014,,REQ0011476,REQ0011476,IT Services,,Informal Competitive,,...,,,,,,,,,,"95841\n(38.662263, -121.346136)"
3,06/13/2014,06/05/2014,2013-2014,,4500236642,,NON-IT Goods,,Informal Competitive,,...,,,,,,,,,,"91436\n(34.151642, -118.49051)"
4,03/12/2014,03/12/2014,2013-2014,1-10-75-60A,4500221028,,NON-IT Goods,,Statewide Contract,,...,44103127.0,44103127.0,,,,,,,,"95814\n(38.580427, -121.494396)"
5,10/09/2014,10/01/2014,2014-2015,,4500253427,,NON-IT Goods,,Informal Competitive,,...,,,,,,,,,,"97008\n(45.460518, -122.806409)"
6,10/10/2014,,2014-2015,1-14-75-60A,REQ0013911,REQ0013911,NON-IT Goods,,Statewide Contract,,...,44103127.0,44103127.0,,,,,,,,"95814\n(38.580427, -121.494396)"
7,04/24/2014,04/14/2014,2013-2014,,12-64006.01,,NON-IT Services,Personal Services,Services are specifically exempt by statute,,...,85121615.0,85121615.0,,,,,,,,"93274\n(36.193481, -119.358379)"
8,02/06/2015,,2014-2015,1-14-75-60A,REQ0014515,REQ0014515,NON-IT Goods,,Statewide Contract,,...,44103127.0,44103127.0,,,,,,,,"95814\n(38.580427, -121.494396)"
9,08/14/2013,07/26/2013,2013-2014,,4500200308,,NON-IT Goods,,Informal Competitive,,...,401728.0,401728.0,,,,,,,,"91322\n(34.379263, -118.547301)"


Viewing the first few rows, it appears there are a lot of null values. Let's now gather some information on the dataframe to see if these null values will need to be replaced or rows/columns will need to be dropped.

In [4]:
# returns information on dataframe, "po_df"
po_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 346018 entries, 0 to 346017
Data columns (total 31 columns):
 #   Column                   Non-Null Count   Dtype  
---  ------                   --------------   -----  
 0   Creation Date            346018 non-null  object 
 1   Purchase Date            328582 non-null  object 
 2   Fiscal Year              346018 non-null  object 
 3   LPA Number               92345 non-null   object 
 4   Purchase Order Number    346018 non-null  object 
 5   Requisition Number       14369 non-null   object 
 6   Acquisition Type         346018 non-null  object 
 7   Sub-Acquisition Type     68337 non-null   object 
 8   Acquisition Method       346018 non-null  object 
 9   Sub-Acquisition Method   30896 non-null   object 
 10  Department Name          346018 non-null  object 
 11  Supplier Code            345982 non-null  float64
 12  Supplier Name            345982 non-null  object 
 13  Supplier Qualifications  141745 non-null  object 
 14  Supp

Based on the information above, a number of columns are superfluous for the analysis. I'm going to remove them in the following lines of code.

Most of the features are in the correct data type for the analysis. However, a few important fields, 'Unit Price', 'Total Price', 'Creation Date', and 'Purchase Date', are not. 'Unit Price' and 'Total Price' are stored as strings and should be floats. 'Creation Date' and 'Purchase Date' are stored as generic objects (strings) and should be datetime objects. The following lines of code will address these issues:

In [5]:
# removes unneeded features from the dataframe
po_df = po_df[['Creation Date','Purchase Date','Acquisition Type','Acquisition Method',
              'Department Name','Supplier Name','Item Name','Quantity','Unit Price',
               'Total Price','Commodity Title']]

# returns first five records of the dataframe
po_df.head()

Unnamed: 0,Creation Date,Purchase Date,Acquisition Type,Acquisition Method,Department Name,Supplier Name,Item Name,Quantity,Unit Price,Total Price,Commodity Title
0,08/27/2013,,IT Goods,WSCA/Coop,"Consumer Affairs, Department of",Pitney Bowes,USB,1.0,$1.00,$1.00,
1,01/29/2014,,NON-IT Goods,Informal Competitive,"Consumer Affairs, Department of",Rodea Auto Tech,Tire Disposal,2.0,$2.00,$4.00,
2,11/01/2013,,IT Services,Informal Competitive,"Consumer Affairs, Department of","Smile Business Products, Inc",Labor,4.5,$150.00,$675.00,
3,06/13/2014,06/05/2014,NON-IT Goods,Informal Competitive,Correctional Health Care Services,ASHAN INC,,,,,
4,03/12/2014,03/12/2014,NON-IT Goods,Statewide Contract,"Corrections and Rehabilitation, Department of",Technology Integration Group,Toner,1.0,$6080.26,$6080.26,


First, let's begin the data cleanup process by addressing the 'Unit Price' and 'Total Price' features. As these features are in the incorrect format for analysis, I will create a function to apply to the 'Unit Price' series. Following its application, I will redefine the feature 'Total Price'.

In [6]:
# function for converting feature 'Unit Price' from string to float data type
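def price_correction(unit_price):
    # missing prices arrive as NaN; pass them through unchanged
    if pd.isnull(unit_price):
        return np.nan
    unit_price = str(unit_price)
    # strip the leading dollar sign before converting to float
    if unit_price.startswith('$'):
        return float(unit_price[1:])
    # anything without a leading '$' is treated as missing
    return np.nan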

In [7]:
# applies function 'price_correction' created above to 'Unit Price' feature
po_df['Unit Price'] = po_df['Unit Price'].apply(price_correction)

In [8]:
po_df.head()

Unnamed: 0,Creation Date,Purchase Date,Acquisition Type,Acquisition Method,Department Name,Supplier Name,Item Name,Quantity,Unit Price,Total Price,Commodity Title
0,08/27/2013,,IT Goods,WSCA/Coop,"Consumer Affairs, Department of",Pitney Bowes,USB,1.0,1.0,$1.00,
1,01/29/2014,,NON-IT Goods,Informal Competitive,"Consumer Affairs, Department of",Rodea Auto Tech,Tire Disposal,2.0,2.0,$4.00,
2,11/01/2013,,IT Services,Informal Competitive,"Consumer Affairs, Department of","Smile Business Products, Inc",Labor,4.5,150.0,$675.00,
3,06/13/2014,06/05/2014,NON-IT Goods,Informal Competitive,Correctional Health Care Services,ASHAN INC,,,,,
4,03/12/2014,03/12/2014,NON-IT Goods,Statewide Contract,"Corrections and Rehabilitation, Department of",Technology Integration Group,Toner,1.0,6080.26,$6080.26,
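
The same conversion can also be done on the whole column at once rather than value by value. The block below is a minimal sketch on a toy Series with hypothetical values, not the notebook's method. It assumes the only non-numeric character in a price is the '$' sign.

```python
import pandas as pd

# toy data (hypothetical values, not taken from the PO extract)
prices = pd.Series(['$1.00', '$150.00', None, '$6080.26'])

# remove the '$' character, then convert to float;
# entries that cannot be parsed become NaN
clean = pd.to_numeric(prices.str.replace('$', '', regex=False), errors='coerce')
print(clean.tolist())
```

One difference from 'price_correction': a value with no '$' sign that is otherwise numeric would be parsed here, whereas the function above treats it as missing.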


Now, let's redefine the column for the total price, which is 'Quantity' x 'Unit Price', so that it is of the correct data type:

In [9]:
# recomputes feature 'Total Price' as 'Quantity' x 'Unit Price', which yields a float data type
po_df['Total Price'] = po_df['Quantity'] * po_df['Unit Price']

po_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 346018 entries, 0 to 346017
Data columns (total 11 columns):
 #   Column              Non-Null Count   Dtype  
---  ------              --------------   -----  
 0   Creation Date       346018 non-null  object 
 1   Purchase Date       328582 non-null  object 
 2   Acquisition Type    346018 non-null  object 
 3   Acquisition Method  346018 non-null  object 
 4   Department Name     346018 non-null  object 
 5   Supplier Name       345982 non-null  object 
 6   Item Name           345987 non-null  object 
 7   Quantity            345988 non-null  float64
 8   Unit Price          345988 non-null  float64
 9   Total Price         345988 non-null  float64
 10  Commodity Title     342723 non-null  object 
dtypes: float64(3), object(8)
memory usage: 29.0+ MB


Viewing the dataframe's information, 'Quantity' and 'Unit Price' each contain 30 null values (345,988 non-null entries out of 346,018 rows). As this analysis is focused on spend, these entries are meaningless without tangible numbers. However, before dropping these records, we need to determine whether the null values are intentional. For example, 'Quantity' and/or 'Unit Price' may be null for service-related purchases, as this spend is amount based.

Let's identify null values by creating two new columns - 'Quantity Null?' and 'Unit Price Null?' - which will tell us whether null values are present.

In [10]:
# creates two new columns in the dataframe
po_df['Quantity Null?'] = np.where(po_df['Quantity'].isnull(), 'Yes', 'No')
po_df['Unit Price Null?'] = np.where(po_df['Unit Price'].isnull(), 'Yes', 'No')

In [11]:
# returns selected columns of the first 5 rows
po_df[['Unit Price','Quantity','Quantity Null?','Unit Price Null?']].head()

Unnamed: 0,Unit Price,Quantity,Quantity Null?,Unit Price Null?
0,1.0,1.0,No,No
1,2.0,2.0,No,No
2,150.0,4.5,No,No
3,,,Yes,Yes
4,6080.26,1.0,No,No


In [12]:
# groups dataframe by 'Acquisition Type' and 'Quantity Null?' and returns the count for each
po_df.groupby(['Acquisition Type','Quantity Null?'])['Acquisition Type'].count()

Acquisition Type       Quantity Null?
IT Goods               No                 50900
IT Services            No                 11516
IT Telecommunications  No                   147
NON-IT Goods           No                215053
                       Yes                   30
NON-IT Services        No                 68372
Name: Acquisition Type, dtype: int64

In [13]:
# groups dataframe by 'Acquisition Type' and 'Unit Price Null?' and returns the count for each
po_df.groupby(['Acquisition Type','Unit Price Null?'])['Acquisition Type'].count()

Acquisition Type       Unit Price Null?
IT Goods               No                   50900
IT Services            No                   11516
IT Telecommunications  No                     147
NON-IT Goods           No                  215053
                       Yes                     30
NON-IT Services        No                   68372
Name: Acquisition Type, dtype: int64

Viewing both groupby results above, the null values appear to be unintentional. The only null values for quantities and unit prices appear in records where the acquisition type is 'NON-IT Goods'. My assumption that null values appear for service-related purchases was incorrect.

We can conclude service purchases include both a unit price and quantity like physical goods. Therefore, I will remove all records where one or both fields are null.
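
The null check and the removal can also be done without the two helper columns. This is a minimal sketch on a toy frame with hypothetical values, not the approach the notebook takes. It uses `isnull` with `groupby` for the counts and `dropna` with `subset` for the removal.

```python
import numpy as np
import pandas as pd

# toy frame (hypothetical values) with one record missing both fields
toy = pd.DataFrame({
    'Acquisition Type': ['IT Goods', 'NON-IT Goods', 'NON-IT Goods'],
    'Quantity': [1.0, np.nan, 2.0],
    'Unit Price': [5.0, np.nan, 3.0],
})

# number of null quantities per acquisition type
null_counts = toy['Quantity'].isnull().groupby(toy['Acquisition Type']).sum()
print(null_counts)

# keep only the records where both fields are present
toy_clean = toy.dropna(subset=['Quantity', 'Unit Price'])
print(len(toy_clean))
```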

In [14]:
# creates boolean index for both features
x = po_df['Quantity Null?'] == 'No'
y = po_df['Unit Price Null?'] == 'No'

# keeps only the records in which both boolean indexes are 'True'
po_df = po_df[x & y]



Let's now clean up the following two features: 'Creation Date' and 'Purchase Date'.

In viewing the fields' values, I believe using the average difference between non-null 'Purchase Date' and 'Creation Date' fields will give a good estimate of what the null 'Purchase Date' values should be. The equation will be as follows:

               Purchase Date (for NaN) = Creation Date - Avg. Difference between Creation & Purchase Dates

In [15]:
# converts feature 'Creation Date' to pandas datetime object
po_df['Creation Date'] = pd.to_datetime(po_df['Creation Date'])

In [16]:
# converts feature 'Purchase Date' to pandas datetime object
po_df['Purchase Date'] = pd.to_datetime(po_df['Purchase Date'], errors = 'coerce')

In [17]:
# creates new column called 'Days Elapsed' by taking the difference between the 'Creation Date' and 'Purchase Date'
po_df['Days Elapsed'] = (po_df['Creation Date'] - po_df['Purchase Date']).dt.days

In [18]:
po_df.head()

Unnamed: 0,Creation Date,Purchase Date,Acquisition Type,Acquisition Method,Department Name,Supplier Name,Item Name,Quantity,Unit Price,Total Price,Commodity Title,Quantity Null?,Unit Price Null?,Days Elapsed
0,2013-08-27,NaT,IT Goods,WSCA/Coop,"Consumer Affairs, Department of",Pitney Bowes,USB,1.0,1.0,1.0,,No,No,
1,2014-01-29,NaT,NON-IT Goods,Informal Competitive,"Consumer Affairs, Department of",Rodea Auto Tech,Tire Disposal,2.0,2.0,4.0,,No,No,
2,2013-11-01,NaT,IT Services,Informal Competitive,"Consumer Affairs, Department of","Smile Business Products, Inc",Labor,4.5,150.0,675.0,,No,No,
4,2014-03-12,2014-03-12,NON-IT Goods,Statewide Contract,"Corrections and Rehabilitation, Department of",Technology Integration Group,Toner,1.0,6080.26,6080.26,,No,No,0.0
6,2014-10-10,NaT,NON-IT Goods,Statewide Contract,"Consumer Affairs, Department of",Technology Integration Group,HP 35A BLACK TONER,30.0,45.4,1362.0,,No,No,


In [19]:
# determines the average for the feature 'Days Elapsed'
avg_days_elapsed = math.floor(po_df['Days Elapsed'].mean())
avg_days_elapsed

62

In [20]:
# if 'Purchase Date' is NaN, assign it the difference between the record's 'Creation Date' and avg days elapsed
# else, return the 'Purchase Date'
po_df['Purchase Date'] = np.where(po_df['Purchase Date'].isnull(), 
                                  (po_df['Creation Date'] - dt.timedelta(avg_days_elapsed)), 
                                  po_df['Purchase Date'])

In [21]:
po_df.head(5)

Unnamed: 0,Creation Date,Purchase Date,Acquisition Type,Acquisition Method,Department Name,Supplier Name,Item Name,Quantity,Unit Price,Total Price,Commodity Title,Quantity Null?,Unit Price Null?,Days Elapsed
0,2013-08-27,2013-06-26,IT Goods,WSCA/Coop,"Consumer Affairs, Department of",Pitney Bowes,USB,1.0,1.0,1.0,,No,No,
1,2014-01-29,2013-11-28,NON-IT Goods,Informal Competitive,"Consumer Affairs, Department of",Rodea Auto Tech,Tire Disposal,2.0,2.0,4.0,,No,No,
2,2013-11-01,2013-08-31,IT Services,Informal Competitive,"Consumer Affairs, Department of","Smile Business Products, Inc",Labor,4.5,150.0,675.0,,No,No,
4,2014-03-12,2014-03-12,NON-IT Goods,Statewide Contract,"Corrections and Rehabilitation, Department of",Technology Integration Group,Toner,1.0,6080.26,6080.26,,No,No,0.0
6,2014-10-10,2014-08-09,NON-IT Goods,Statewide Contract,"Consumer Affairs, Department of",Technology Integration Group,HP 35A BLACK TONER,30.0,45.4,1362.0,,No,No,
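
To make the imputation rule concrete, here is a minimal sketch on a toy frame with hypothetical dates. It follows the same steps as the cells above (average lag in whole days, then creation date minus that lag) but uses `Series.fillna` in place of `np.where`, which is a substitution made here for illustration.

```python
import math
import pandas as pd

# toy frame (hypothetical dates); the second record has no purchase date
toy = pd.DataFrame({
    'Creation Date': pd.to_datetime(['2014-03-12', '2013-08-27', '2014-05-10']),
    'Purchase Date': pd.to_datetime(['2014-03-02', None, '2014-04-30']),
})

# lag in whole days for the records that have both dates (NaN otherwise)
lag = (toy['Creation Date'] - toy['Purchase Date']).dt.days
avg_lag = math.floor(lag.mean())

# estimate = creation date minus the average lag; only used where the date is missing
estimate = toy['Creation Date'] - pd.Timedelta(days=avg_lag)
toy['Purchase Date'] = toy['Purchase Date'].fillna(estimate)
print(toy)
```

Because a mean is sensitive to extreme values, swapping `lag.mean()` for `lag.median()` is one possible variation. That is a suggestion, not what the notebook does.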


According to the data dictionary:

    "The State Contract and Procurement Registration System (SCPRS) was established in 2003, as a centralized database of 
    information on State contracts and purchases over $5000. eSCPRS represents the data captured in the State's 
    eProcurement (eP) system, Bidsync, as of March 16, 2009. The data provided is an extract from that system for fiscal 
    years 2012-2013, 2013-2014, and 2014-2015."

As the data represents purchases from the years 2012 to 2015, any purchases outside of that range were likely inputted incorrectly. Consider the following record in which the 'Purchase Date' is recorded as having occurred in the year 1912. Obviously this is an error:

In [22]:
po_df.loc[736]

Creation Date                       2012-08-08 00:00:00
Purchase Date                       1912-08-08 00:00:00
Acquisition Type                           NON-IT Goods
Acquisition Method                 Informal Competitive
Department Name       Parks & Recreation, Department of
Supplier Name              Hayward Lumber & Home Supply
Item Name                                         Nails
Quantity                                            1.0
Unit Price                                        64.98
Total Price                                       64.98
Commodity Title                               Cap nails
Quantity Null?                                       No
Unit Price Null?                                     No
Days Elapsed                                    36525.0
Name: 736, dtype: object

Therefore, for any 'Purchase Date' that is outside of the range [2012, 2015], I will replace its value with its 'Creation Date'.

In [23]:
# corrects the instances in which the 'Purchase Date' is outside of the designated range by 
# reassigning it its 'Creation Date'
def correct_date(date):
    if date['Purchase Date'].year > 2015 or date['Purchase Date'].year < 2012:
        return date['Creation Date']
    else:
        return date['Purchase Date']

In [24]:
# applies the function 'correct_date' to each row of the dataframe and stores the result in 'Purchase Date'
po_df['Purchase Date'] = po_df.apply(correct_date, axis = 1)

If we look at index 736 again, we can see that the function worked as expected by replacing the original 'Purchase Date' with its 'Creation Date':

In [25]:
po_df.loc[736]

Creation Date                       2012-08-08 00:00:00
Purchase Date                       2012-08-08 00:00:00
Acquisition Type                           NON-IT Goods
Acquisition Method                 Informal Competitive
Department Name       Parks & Recreation, Department of
Supplier Name              Hayward Lumber & Home Supply
Item Name                                         Nails
Quantity                                            1.0
Unit Price                                        64.98
Total Price                                       64.98
Commodity Title                               Cap nails
Quantity Null?                                       No
Unit Price Null?                                     No
Days Elapsed                                    36525.0
Name: 736, dtype: object
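
The same range check can be written without a row-wise `apply`, which may be faster on a frame of this size. This is a minimal sketch on a toy frame with hypothetical records, using `Series.dt.year`, `Series.between`, and `Series.where`. It is an alternative formulation, not the notebook's own code.

```python
import pandas as pd

# toy frame (hypothetical records); the first purchase date has the wrong century
toy = pd.DataFrame({
    'Creation Date': pd.to_datetime(['2012-08-08', '2014-03-12']),
    'Purchase Date': pd.to_datetime(['1912-08-08', '2014-03-12']),
})

# True where the purchase year falls inside 2012-2015 (inclusive)
in_range = toy['Purchase Date'].dt.year.between(2012, 2015)

# keep in-range purchase dates, otherwise fall back to the creation date
toy['Purchase Date'] = toy['Purchase Date'].where(in_range, toy['Creation Date'])
print(toy)
```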

For null values in the fields 'Supplier Name' and 'Commodity Title', let's address them as follows:

    - 'Supplier Name': replace NaN values with the string 'UNKNOWN'
    - 'Commodity Title': replace NaN values with each record's corresponding 'Item Name'. This represents the best guess as 
       to what the commodity likely is.

In [26]:
# boolean index to identify records where 'Supplier Name' is NaN
x = po_df['Supplier Name'].isnull()

# replaces null supplier names with 'UNKNOWN'
po_df['Supplier Name'] = np.where(x, 'UNKNOWN', po_df['Supplier Name'])

For the records with a null commodity title, we will replace their values with their corresponding item names. The code is as follows:

In [27]:
po_df['Commodity Title'] = np.where(po_df['Commodity Title'].isnull(), po_df['Item Name'], po_df['Commodity Title'])
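
Both replacements can also be expressed with `Series.fillna`, which accepts either a single value or another Series aligned on the index. The block below is a minimal sketch on a toy frame. The 'Printer toner' title is hypothetical and the frame is not taken from the extract.

```python
import numpy as np
import pandas as pd

# toy frame with one missing supplier name and one missing commodity title
toy = pd.DataFrame({
    'Supplier Name': ['Pitney Bowes', np.nan],
    'Item Name': ['USB', 'Toner'],
    'Commodity Title': [np.nan, 'Printer toner'],
})

# a scalar fills every missing supplier name with the same label
toy['Supplier Name'] = toy['Supplier Name'].fillna('UNKNOWN')

# a Series fills each missing commodity title from the same row's item name
toy['Commodity Title'] = toy['Commodity Title'].fillna(toy['Item Name'])
print(toy)
```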

In [28]:
po_df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 345988 entries, 0 to 346017
Data columns (total 14 columns):
 #   Column              Non-Null Count   Dtype         
---  ------              --------------   -----         
 0   Creation Date       345988 non-null  datetime64[ns]
 1   Purchase Date       345988 non-null  datetime64[ns]
 2   Acquisition Type    345988 non-null  object        
 3   Acquisition Method  345988 non-null  object        
 4   Department Name     345988 non-null  object        
 5   Supplier Name       345988 non-null  object        
 6   Item Name           345987 non-null  object        
 7   Quantity            345988 non-null  float64       
 8   Unit Price          345988 non-null  float64       
 9   Total Price         345988 non-null  float64       
 10  Commodity Title     345988 non-null  object        
 11  Quantity Null?      345988 non-null  object        
 12  Unit Price Null?    345988 non-null  object        
 13  Days Elapsed        328526 no

The dataframe's information above shows that one item still has a null name value. Let's clean up this final value before beginning the analysis.

In [35]:
po_df[po_df['Item Name'].isnull()]

Unnamed: 0,Creation Date,Purchase Date,Acquisition Type,Acquisition Method,Department Name,Supplier Name,Item Name,Quantity,Unit Price,Total Price,Commodity Title,Days Elapsed,Quantity Null?,Unit Price Null?
209749,2015-03-18,2014-07-01,IT SERVICES,Statewide Contract,CORRECTIONAL HEALTH CARE SERVICES,"SMILE BUSINESS PRODUCTS, INC",,1.0,0.0,0.0,General office equipment maintenance,260.0,No,No


For this single record, let's assign the item the name, 'Equipment Maintenance'.

In [29]:
# df.loc locates the actual index of the record
po_df.loc[209749, 'Item Name'] = 'Equipment Maintenance'

Now that we have addressed the features with null and incorrect values, let's adjust the formatting of some of the features in the dataframe.

In [30]:
# function to capitalize series of strings
def all_caps(name):
    return name.upper()

In [33]:
# applies formatting function 'all_caps' to a number of the dataframe's features
po_df['Supplier Name'] = po_df['Supplier Name'].apply(all_caps)
po_df['Acquisition Type'] = po_df['Acquisition Type'].apply(all_caps)
po_df['Department Name'] = po_df['Department Name'].apply(all_caps)
po_df['Commodity Title'] = po_df['Commodity Title'].apply(all_caps)
po_df['Item Name']  = po_df['Item Name'].apply(all_caps)
po_df.head()

Unnamed: 0,Creation Date,Purchase Date,Acquisition Type,Acquisition Method,Department Name,Supplier Name,Item Name,Quantity,Unit Price,Total Price,Commodity Title,Quantity Null?,Unit Price Null?,Days Elapsed
0,2013-08-27,2013-06-26,IT GOODS,WSCA/Coop,"CONSUMER AFFAIRS, DEPARTMENT OF",PITNEY BOWES,USB,1.0,1.0,1.0,USB,No,No,
1,2014-01-29,2013-11-28,NON-IT GOODS,Informal Competitive,"CONSUMER AFFAIRS, DEPARTMENT OF",RODEA AUTO TECH,TIRE DISPOSAL,2.0,2.0,4.0,TIRE DISPOSAL,No,No,
2,2013-11-01,2013-08-31,IT SERVICES,Informal Competitive,"CONSUMER AFFAIRS, DEPARTMENT OF","SMILE BUSINESS PRODUCTS, INC",LABOR,4.5,150.0,675.0,LABOR,No,No,
4,2014-03-12,2014-03-12,NON-IT GOODS,Statewide Contract,"CORRECTIONS AND REHABILITATION, DEPARTMENT OF",TECHNOLOGY INTEGRATION GROUP,TONER,1.0,6080.26,6080.26,TONER,No,No,0.0
6,2014-10-10,2014-08-09,NON-IT GOODS,Statewide Contract,"CONSUMER AFFAIRS, DEPARTMENT OF",TECHNOLOGY INTEGRATION GROUP,HP 35A BLACK TONER,30.0,45.4,1362.0,HP 35A BLACK TONER,No,No,
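
The capitalization step can also use the pandas string accessor. This is a minimal sketch on a toy frame with illustrative values, looping over a hypothetical list of columns. Unlike calling `name.upper()` inside `apply`, `.str.upper()` passes missing values through as NaN instead of raising an error.

```python
import pandas as pd

# toy frame (illustrative values)
toy = pd.DataFrame({
    'Supplier Name': ['Pitney Bowes', 'ASHAN INC'],
    'Item Name': ['Usb', 'Toner'],
})

# upper-case every string in each listed column
for col in ['Supplier Name', 'Item Name']:
    toy[col] = toy[col].str.upper()
print(toy)
```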


In [34]:
# drops the two helper columns and the stale 'Days Elapsed' column
po_df.drop(labels = ['Quantity Null?','Unit Price Null?','Days Elapsed'], axis = 1, inplace = True)

In [35]:
# recalculates 'Days Elapsed' using the cleaned 'Purchase Date' values
po_df['Days Elapsed'] = (po_df['Creation Date'] - po_df['Purchase Date']).dt.days

In [36]:
po_df.head()

Unnamed: 0,Creation Date,Purchase Date,Acquisition Type,Acquisition Method,Department Name,Supplier Name,Item Name,Quantity,Unit Price,Total Price,Commodity Title,Days Elapsed
0,2013-08-27,2013-06-26,IT GOODS,WSCA/Coop,"CONSUMER AFFAIRS, DEPARTMENT OF",PITNEY BOWES,USB,1.0,1.0,1.0,USB,62
1,2014-01-29,2013-11-28,NON-IT GOODS,Informal Competitive,"CONSUMER AFFAIRS, DEPARTMENT OF",RODEA AUTO TECH,TIRE DISPOSAL,2.0,2.0,4.0,TIRE DISPOSAL,62
2,2013-11-01,2013-08-31,IT SERVICES,Informal Competitive,"CONSUMER AFFAIRS, DEPARTMENT OF","SMILE BUSINESS PRODUCTS, INC",LABOR,4.5,150.0,675.0,LABOR,62
4,2014-03-12,2014-03-12,NON-IT GOODS,Statewide Contract,"CORRECTIONS AND REHABILITATION, DEPARTMENT OF",TECHNOLOGY INTEGRATION GROUP,TONER,1.0,6080.26,6080.26,TONER,0
6,2014-10-10,2014-08-09,NON-IT GOODS,Statewide Contract,"CONSUMER AFFAIRS, DEPARTMENT OF",TECHNOLOGY INTEGRATION GROUP,HP 35A BLACK TONER,30.0,45.4,1362.0,HP 35A BLACK TONER,62


For the analysis, I'd like to address the following questions:

    (1) How long did it take on average for a PO to be entered into the system from the purchase date (Creation Date - 
        Purchase Date)? What was the distribution?
    (2) Who are the top 10 suppliers by total spend?
    (3) Is there a correlation between the total price of a purchase and the item's name/commodity?
    (4) How much did each department spend?
    (5) What is the total spend per year?
    (6) How does the supplier count change month over month?
    
These questions will be addressed using Power BI.
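
Although the analysis itself is done in Power BI, a couple of these questions can be sanity-checked in pandas before export. The block below is a minimal sketch for questions (2) and (5) on a toy frame. It assumes "per year" means the calendar year of the 'Purchase Date', which may differ from the fiscal year used elsewhere.

```python
import pandas as pd

# toy frame (a few illustrative records, not the full dataset)
toy = pd.DataFrame({
    'Purchase Date': pd.to_datetime(['2013-06-26', '2014-03-12', '2014-08-09']),
    'Supplier Name': ['PITNEY BOWES', 'TECHNOLOGY INTEGRATION GROUP',
                      'TECHNOLOGY INTEGRATION GROUP'],
    'Total Price': [1.0, 6080.26, 1362.0],
})

# question (2): suppliers ranked by total spend, largest first (up to 10)
top_suppliers = toy.groupby('Supplier Name')['Total Price'].sum().nlargest(10)
print(top_suppliers)

# question (5): total spend per calendar year of the purchase date
spend_per_year = toy.groupby(toy['Purchase Date'].dt.year)['Total Price'].sum()
print(spend_per_year)
```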

In [37]:
# writes the cleaned dataframe to a csv file for use in Power BI
po_df.to_csv('CA_PO_Data Final.csv')