# Managing Dataframe

Please use the provided dataset to carry out the following tasks.


## Common Use Cases

- Remove rows with a blank or unknown supplier name
- Group by tender number to get the total awarded amount per tender number
- Check whether an awarded amount is above a certain threshold
- Perform text analytics on the tender description to filter rows where the procurement matches certain keywords or meanings
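
As a rough preview of the first two use cases, the sketch below drops rows with a blank or "Unknown" supplier name and then sums the awarded amount per tender number. The frame ```toy``` and its values are made up for illustration; only the column names come from the dataset.

```python
import pandas as pd

# Hypothetical rows that reuse the dataset's column names
toy = pd.DataFrame({
    "tender_no": ["T001", "T001", "T002", "T003"],
    "supplier_name": ["ALPHA PTE. LTD.", "BETA PTE. LTD.", "Unknown", None],
    "awarded_amt": [100.0, 50.0, 0.0, 0.0],
})

# Keep rows whose supplier name is present and is not the placeholder "Unknown"
has_supplier = toy["supplier_name"].notna() & (toy["supplier_name"].str.strip() != "Unknown")
known = toy[has_supplier]

# Total awarded amount per tender number
total_per_tender = known.groupby("tender_no")["awarded_amt"].sum()
print(total_per_tender)
```

Summing per tender number matters here because one tender can be awarded to several suppliers, each on its own row.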

## Import the necessary packages and libraries

In [31]:
import pandas as pd
import numpy as np

## Read in the data file

In [32]:
df = pd.read_csv('GovernmentProcurement.csv')

df.head(10)

Unnamed: 0,tender_no,tender_description,agency,award_date,tender_detail_status,supplier_name,awarded_amt,extra
0,ACR000ETT18300010,"SUPPLY, DESIGN, DEVELOPMENT, CUSTOMIZATION, DE...",Accounting And Corporate Regulatory Authority,11/6/2019,Awarded to Suppliers,AZAAS PTE. LTD.,$2305880.000,
1,ACR000ETT18300011,"APPLICATION ENHANCEMENT, CUSTOMISATION, MIGRAT...",Accounting And Corporate Regulatory Authority,10/5/2019,Awarded to No Suppliers,Unknown,$0.000,
2,ACR000ETT19300001,PROVISION OF CONSULTANCY SERVICES FOR STRATEGI...,Accounting And Corporate Regulatory Authority,30/4/2019,Awarded to Suppliers,ACCENTURE SG SERVICES PTE. LTD.,$2035000.000,
3,ACR000ETT19300002,"SUPPLY, DELIVERY, DESIGN, CUSTOMISATION, INSTA...",Accounting And Corporate Regulatory Authority,29/8/2019,Awarded to Suppliers,TECH MAHINDRA LIMITED (SINGAPORE BRANCH),$30700373.870,
4,ACR000ETT19300003,PROVISION OF MEDIA MONITORING SERVICES FOR A P...,Accounting And Corporate Regulatory Authority,6/8/2019,Awarded to Suppliers,INSIGHTMATRIX,$178800.000,
5,ACR000ETT19300004,INVITATION TO TENDER FOR THE PROVISION OF CALL...,Accounting And Corporate Regulatory Authority,5/11/2019,Awarded to Suppliers,TDCX (SG) PTE. LTD.,$8566830.840,
6,ACR000ETT20300002,INVITATION TO TENDER FOR THE PROVISION OF SERV...,Accounting And Corporate Regulatory Authority,10/11/2020,Awarded by Items,DELOITTE & TOUCHE ENTERPRISE RISK SERVICES PTE...,$285000.000,
7,ACR000ETT20300002,INVITATION TO TENDER FOR THE PROVISION OF SERV...,Accounting And Corporate Regulatory Authority,10/11/2020,Awarded by Items,KPMG SERVICES PTE. LTD.,$90000.000,
8,ACR000ETT20300003,PROVISION OF AN IT SECURITY CONTROLS AND OPERA...,Accounting And Corporate Regulatory Authority,9/12/2020,Awarded to Suppliers,ERNST & YOUNG ADVISORY PTE. LTD.,$182400.000,
9,ACR000ETT20300004,"CONCEPTUALIZATION, DESIGN, BUILD, SET-UP OF NE...",Accounting And Corporate Regulatory Authority,9/3/2021,Awarded to Suppliers,D' PERCEPTION SINGAPORE PTE. LTD.,$3071056.400,


## Check the number of rows and columns in the dataset

In [33]:
df_shape = df.shape
print(f'Rows and columns in one csv file is {df_shape}')

Rows and columns in one csv file is (18640, 8)


## Remove a column

In [34]:
df = df.drop('extra', axis=1)

In [35]:
df

Unnamed: 0,tender_no,tender_description,agency,award_date,tender_detail_status,supplier_name,awarded_amt
0,ACR000ETT18300010,"SUPPLY, DESIGN, DEVELOPMENT, CUSTOMIZATION, DE...",Accounting And Corporate Regulatory Authority,11/6/2019,Awarded to Suppliers,AZAAS PTE. LTD.,$2305880.000
1,ACR000ETT18300011,"APPLICATION ENHANCEMENT, CUSTOMISATION, MIGRAT...",Accounting And Corporate Regulatory Authority,10/5/2019,Awarded to No Suppliers,Unknown,$0.000
2,ACR000ETT19300001,PROVISION OF CONSULTANCY SERVICES FOR STRATEGI...,Accounting And Corporate Regulatory Authority,30/4/2019,Awarded to Suppliers,ACCENTURE SG SERVICES PTE. LTD.,$2035000.000
3,ACR000ETT19300002,"SUPPLY, DELIVERY, DESIGN, CUSTOMISATION, INSTA...",Accounting And Corporate Regulatory Authority,29/8/2019,Awarded to Suppliers,TECH MAHINDRA LIMITED (SINGAPORE BRANCH),$30700373.870
4,ACR000ETT19300003,PROVISION OF MEDIA MONITORING SERVICES FOR A P...,Accounting And Corporate Regulatory Authority,6/8/2019,Awarded to Suppliers,INSIGHTMATRIX,$178800.000
...,...,...,...,...,...,...,...
18635,YRS000ETT23000002,[T03/2023] Provision of event management servi...,Yellow Ribbon Singapore,9/6/2023,Awarded to Suppliers,ADVENTURER'S SINGAPORE PTE LTD,$912605.000
18636,YRS000ETT23000003,[T02/2023] For the set-up of logistics facilit...,Yellow Ribbon Singapore,15/8/2023,Awarded by Items,FIRECONTROL TECH PTE. LTD.,$72400.000
18637,YRS000ETT23000003,[T02/2023] For the set-up of logistics facilit...,Yellow Ribbon Singapore,15/8/2023,Awarded by Items,KUNYI (SINGAPORE) PTE LTD,$648000.000
18638,YRS000ETT23000004,[T04/2023] Provision of legal services for gen...,Yellow Ribbon Singapore,19/9/2023,Awarded to Suppliers,CENTRAL CHAMBERS LAW CORPORATION,$184800.000


## List the numbers of rows with null values

In Pandas, null values represent missing or unknown data. They can appear as NaN (Not a Number) or None.

In [36]:
# look for missing values
df.isnull().any()

tender_no                True
tender_description      False
agency                   True
award_date              False
tender_detail_status    False
supplier_name            True
awarded_amt             False
dtype: bool

In [37]:
# Calculate null values per column
df.isnull().sum()

tender_no               1
tender_description      0
agency                  1
award_date              0
tender_detail_status    0
supplier_name           2
awarded_amt             0
dtype: int64

In [38]:
# Count the rows that contain at least one null value; axis=1 checks across each row
df.isnull().any(axis=1).sum()

2
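
The per-column counts and the per-row count answer different questions, because a single row can have missing values in several columns. A minimal sketch on a made-up frame (```toy``` is hypothetical) shows the difference:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: NaN and None are both treated as missing
toy = pd.DataFrame({
    "tender_no": ["T001", None, "T003"],
    "agency": ["A", np.nan, "C"],
    "awarded_amt": [1.0, 2.0, 3.0],
})

nulls_per_column = toy.isnull().sum()             # missing cells in each column
rows_with_null = toy.isnull().any(axis=1).sum()   # rows with at least one missing cell

print(nulls_per_column)
print(rows_with_null)
```

Here two cells are missing in total, but both sit in the same row, so only one row would be removed by ```dropna()```.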

## Remove rows with null values

In [39]:
# Remove rows with NaN values
# make a new copy explicitly
df_cleaned = df.copy().dropna()
df_cleaned

Unnamed: 0,tender_no,tender_description,agency,award_date,tender_detail_status,supplier_name,awarded_amt
0,ACR000ETT18300010,"SUPPLY, DESIGN, DEVELOPMENT, CUSTOMIZATION, DE...",Accounting And Corporate Regulatory Authority,11/6/2019,Awarded to Suppliers,AZAAS PTE. LTD.,$2305880.000
1,ACR000ETT18300011,"APPLICATION ENHANCEMENT, CUSTOMISATION, MIGRAT...",Accounting And Corporate Regulatory Authority,10/5/2019,Awarded to No Suppliers,Unknown,$0.000
2,ACR000ETT19300001,PROVISION OF CONSULTANCY SERVICES FOR STRATEGI...,Accounting And Corporate Regulatory Authority,30/4/2019,Awarded to Suppliers,ACCENTURE SG SERVICES PTE. LTD.,$2035000.000
3,ACR000ETT19300002,"SUPPLY, DELIVERY, DESIGN, CUSTOMISATION, INSTA...",Accounting And Corporate Regulatory Authority,29/8/2019,Awarded to Suppliers,TECH MAHINDRA LIMITED (SINGAPORE BRANCH),$30700373.870
4,ACR000ETT19300003,PROVISION OF MEDIA MONITORING SERVICES FOR A P...,Accounting And Corporate Regulatory Authority,6/8/2019,Awarded to Suppliers,INSIGHTMATRIX,$178800.000
...,...,...,...,...,...,...,...
18635,YRS000ETT23000002,[T03/2023] Provision of event management servi...,Yellow Ribbon Singapore,9/6/2023,Awarded to Suppliers,ADVENTURER'S SINGAPORE PTE LTD,$912605.000
18636,YRS000ETT23000003,[T02/2023] For the set-up of logistics facilit...,Yellow Ribbon Singapore,15/8/2023,Awarded by Items,FIRECONTROL TECH PTE. LTD.,$72400.000
18637,YRS000ETT23000003,[T02/2023] For the set-up of logistics facilit...,Yellow Ribbon Singapore,15/8/2023,Awarded by Items,KUNYI (SINGAPORE) PTE LTD,$648000.000
18638,YRS000ETT23000004,[T04/2023] Provision of legal services for gen...,Yellow Ribbon Singapore,19/9/2023,Awarded to Suppliers,CENTRAL CHAMBERS LAW CORPORATION,$184800.000


## Explore the datatypes of each column

In [40]:
df_cleaned.dtypes

tender_no                object
tender_description       object
agency                   object
award_date               object
tender_detail_status     object
supplier_name            object
awarded_amt             float64
dtype: object

## Changing the printed precision for numbers

In [41]:
# Display floats in fixed-point notation (no scientific notation) with a $ prefix and 3 decimal places
pd.options.display.float_format = '${:.3f}'.format

In [42]:
df_cleaned

Unnamed: 0,tender_no,tender_description,agency,award_date,tender_detail_status,supplier_name,awarded_amt
0,ACR000ETT18300010,"SUPPLY, DESIGN, DEVELOPMENT, CUSTOMIZATION, DE...",Accounting And Corporate Regulatory Authority,11/6/2019,Awarded to Suppliers,AZAAS PTE. LTD.,$2305880.000
1,ACR000ETT18300011,"APPLICATION ENHANCEMENT, CUSTOMISATION, MIGRAT...",Accounting And Corporate Regulatory Authority,10/5/2019,Awarded to No Suppliers,Unknown,$0.000
2,ACR000ETT19300001,PROVISION OF CONSULTANCY SERVICES FOR STRATEGI...,Accounting And Corporate Regulatory Authority,30/4/2019,Awarded to Suppliers,ACCENTURE SG SERVICES PTE. LTD.,$2035000.000
3,ACR000ETT19300002,"SUPPLY, DELIVERY, DESIGN, CUSTOMISATION, INSTA...",Accounting And Corporate Regulatory Authority,29/8/2019,Awarded to Suppliers,TECH MAHINDRA LIMITED (SINGAPORE BRANCH),$30700373.870
4,ACR000ETT19300003,PROVISION OF MEDIA MONITORING SERVICES FOR A P...,Accounting And Corporate Regulatory Authority,6/8/2019,Awarded to Suppliers,INSIGHTMATRIX,$178800.000
...,...,...,...,...,...,...,...
18635,YRS000ETT23000002,[T03/2023] Provision of event management servi...,Yellow Ribbon Singapore,9/6/2023,Awarded to Suppliers,ADVENTURER'S SINGAPORE PTE LTD,$912605.000
18636,YRS000ETT23000003,[T02/2023] For the set-up of logistics facilit...,Yellow Ribbon Singapore,15/8/2023,Awarded by Items,FIRECONTROL TECH PTE. LTD.,$72400.000
18637,YRS000ETT23000003,[T02/2023] For the set-up of logistics facilit...,Yellow Ribbon Singapore,15/8/2023,Awarded by Items,KUNYI (SINGAPORE) PTE LTD,$648000.000
18638,YRS000ETT23000004,[T04/2023] Provision of legal services for gen...,Yellow Ribbon Singapore,19/9/2023,Awarded to Suppliers,CENTRAL CHAMBERS LAW CORPORATION,$184800.000
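
The formatter only changes how floats are displayed; the stored values stay numeric. As a possible variation, the sketch below adds thousands separators and uses ```pd.option_context``` so that the setting applies only inside the ```with``` block. The two sample amounts are copied from the table above; the variable names are illustrative.

```python
import pandas as pd

amounts = pd.Series([2305880.0, 30700373.87])  # sample amounts from the table

# option_context restores the previous display setting when the block ends
with pd.option_context("display.float_format", "${:,.3f}".format):
    formatted = amounts.to_string()

print(formatted)
print(amounts.dtype)  # the data itself is still float64
```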


## Changing date format

|Code | Meaning |
|---|---|
|%Y|Year (four digits)|
|%m|Month (01-12)|
|%d|Day of month (01-31)|
|%A|Weekday (full name)|
|%B|Month (full name)|
|%H|Hour (24-hour clock, 00-23)|
|%M|Minute (00-59)|
|%S|Second (00-59)|
|%p|AM/PM designation|


Note: Use errors='coerce' so that when a string cannot be converted (e.g., due to an invalid format), Pandas will insert a NaT (Not a Time) value instead of raising an error.
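
To see the effect of ```errors='coerce'``` in isolation, here is a small sketch on made-up strings (the series ```raw``` is hypothetical and not part of the dataset):

```python
import pandas as pd

raw = pd.Series(["11/6/2019", "30/4/2019", "not a date"])  # hypothetical inputs

# Day-first format, matching the dataset's award_date strings
parsed = pd.to_datetime(raw, format="%d/%m/%Y", errors="coerce")
print(parsed)

# Unparseable entries become NaT, so it is worth counting them afterwards
print(parsed.isna().sum())
```

Counting the NaT values right after the conversion is a cheap way to find out whether any rows failed to parse before they are formatted or filtered.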

In [43]:
df_cleaned.loc[:,'award_date'] = pd.to_datetime(df_cleaned['award_date'], format='%d/%m/%Y', errors='coerce')

In [44]:
df_cleaned.loc[:,'award_date'] = df_cleaned['award_date'].apply(lambda x: x.strftime('%A, %B %d, %Y'))

In [45]:
df_cleaned

Unnamed: 0,tender_no,tender_description,agency,award_date,tender_detail_status,supplier_name,awarded_amt
0,ACR000ETT18300010,"SUPPLY, DESIGN, DEVELOPMENT, CUSTOMIZATION, DE...",Accounting And Corporate Regulatory Authority,"Tuesday, June 11, 2019",Awarded to Suppliers,AZAAS PTE. LTD.,$2305880.000
1,ACR000ETT18300011,"APPLICATION ENHANCEMENT, CUSTOMISATION, MIGRAT...",Accounting And Corporate Regulatory Authority,"Friday, May 10, 2019",Awarded to No Suppliers,Unknown,$0.000
2,ACR000ETT19300001,PROVISION OF CONSULTANCY SERVICES FOR STRATEGI...,Accounting And Corporate Regulatory Authority,"Tuesday, April 30, 2019",Awarded to Suppliers,ACCENTURE SG SERVICES PTE. LTD.,$2035000.000
3,ACR000ETT19300002,"SUPPLY, DELIVERY, DESIGN, CUSTOMISATION, INSTA...",Accounting And Corporate Regulatory Authority,"Thursday, August 29, 2019",Awarded to Suppliers,TECH MAHINDRA LIMITED (SINGAPORE BRANCH),$30700373.870
4,ACR000ETT19300003,PROVISION OF MEDIA MONITORING SERVICES FOR A P...,Accounting And Corporate Regulatory Authority,"Tuesday, August 06, 2019",Awarded to Suppliers,INSIGHTMATRIX,$178800.000
...,...,...,...,...,...,...,...
18635,YRS000ETT23000002,[T03/2023] Provision of event management servi...,Yellow Ribbon Singapore,"Friday, June 09, 2023",Awarded to Suppliers,ADVENTURER'S SINGAPORE PTE LTD,$912605.000
18636,YRS000ETT23000003,[T02/2023] For the set-up of logistics facilit...,Yellow Ribbon Singapore,"Tuesday, August 15, 2023",Awarded by Items,FIRECONTROL TECH PTE. LTD.,$72400.000
18637,YRS000ETT23000003,[T02/2023] For the set-up of logistics facilit...,Yellow Ribbon Singapore,"Tuesday, August 15, 2023",Awarded by Items,KUNYI (SINGAPORE) PTE LTD,$648000.000
18638,YRS000ETT23000004,[T04/2023] Provision of legal services for gen...,Yellow Ribbon Singapore,"Tuesday, September 19, 2023",Awarded to Suppliers,CENTRAL CHAMBERS LAW CORPORATION,$184800.000


## Are there duplicated rows?

In [46]:
# Check if any duplicates exist in the entire DataFrame
isAnyRowDuplicated = df.duplicated().any()
isAnyRowDuplicated

False
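
The check above compares entire rows. Because one tender number can appear on several rows (one per awarded supplier), checking a single column answers a different question. A minimal sketch on a hypothetical frame:

```python
import pandas as pd

# Hypothetical rows: tender T001 is awarded to two suppliers
toy = pd.DataFrame({
    "tender_no": ["T001", "T001", "T002"],
    "supplier_name": ["ALPHA", "BETA", "ALPHA"],
})

# Compares every column, so no row counts as a duplicate here
whole_row_dupes = toy.duplicated().any()

# keep=False flags every row whose tender number occurs more than once
repeated_tenders = toy.duplicated(subset=["tender_no"], keep=False)

print(whole_row_dupes)
print(toy[repeated_tenders])
```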

## List the unique values in each column

In [47]:
# Get unique values for each column
unique_values = {col: df[col].unique() for col in df}

print("Unique values in each column:\n")
for col, values in unique_values.items():
    print(f"{col}: {values}")

Unique values in each column:

tender_no: ['ACR000ETT18300010' 'ACR000ETT18300011' 'ACR000ETT19300001' ...
 'YRS000ETT23000003' 'YRS000ETT23000004' 'YRS000ETT23000005']
tender_description: ['SUPPLY, DESIGN, DEVELOPMENT, CUSTOMIZATION, DELIVERY, INSTALLATION, TESTING, COMMISSIONING AND MAINTENANCE OF A VARIABLE CAPITAL COMPANIES (VCC) SYSTEM FOR ACCOUNTING AND CORPORATE REGULARTORY AUTHORITY'
 'APPLICATION ENHANCEMENT, CUSTOMISATION, MIGRATION, DELIVERY, INSTALLATION, TESTING AND COMMISSIONING OF THE FULLY OPERATIONAL BIZFINx 2.0 APPLICATION SYSTEM WITH AN OPTION FOR MAINTENANCE'
 "PROVISION OF CONSULTANCY SERVICES FOR STRATEGIC BUSINESS PROCESSES RE-ENGINEERING (SBPR) AND ACRA'S IT INFRASTRUCTURE"
 ...
 '[T02/2023] For the set-up of logistics facilities training classrooms and laboratories at A2 Level 5 and B5 Level 5 of Changi Prison Complex'
 '[T04/2023] Provision of legal services for general retainer arrangement for a period of 3 years with an option to extend up to 2 years'
 'Enga

## Determine range of awarded amount

In [48]:
print("The range of awarded amount is :" , min(df_cleaned.awarded_amt), " to ", max(df_cleaned.awarded_amt))


The range of awarded amount is : 0.0  to  1493179167.0


## Determine range of date

In [49]:
df_cleaned['award_date'] = pd.to_datetime(df_cleaned['award_date'])

In [50]:
df_cleaned.award_date.dtype

dtype('<M8[ns]')

In [51]:
print("The range of awarded date is :" , min(df_cleaned.award_date.dt.date), " to ", max(df_cleaned.award_date.dt.date))

The range of awarded date is : 2019-04-01  to  2024-03-31


## Filtering data

### Using logical operators to filter rows

#### Finding tenders of $1 million or more

In [56]:
df_cleaned

Unnamed: 0,tender_no,tender_description,agency,award_date,tender_detail_status,supplier_name,awarded_amt
0,ACR000ETT18300010,"SUPPLY, DESIGN, DEVELOPMENT, CUSTOMIZATION, DE...",Accounting And Corporate Regulatory Authority,2019-06-11,Awarded to Suppliers,AZAAS PTE. LTD.,$2305880.000
1,ACR000ETT18300011,"APPLICATION ENHANCEMENT, CUSTOMISATION, MIGRAT...",Accounting And Corporate Regulatory Authority,2019-05-10,Awarded to No Suppliers,Unknown,$0.000
2,ACR000ETT19300001,PROVISION OF CONSULTANCY SERVICES FOR STRATEGI...,Accounting And Corporate Regulatory Authority,2019-04-30,Awarded to Suppliers,ACCENTURE SG SERVICES PTE. LTD.,$2035000.000
3,ACR000ETT19300002,"SUPPLY, DELIVERY, DESIGN, CUSTOMISATION, INSTA...",Accounting And Corporate Regulatory Authority,2019-08-29,Awarded to Suppliers,TECH MAHINDRA LIMITED (SINGAPORE BRANCH),$30700373.870
4,ACR000ETT19300003,PROVISION OF MEDIA MONITORING SERVICES FOR A P...,Accounting And Corporate Regulatory Authority,2019-08-06,Awarded to Suppliers,INSIGHTMATRIX,$178800.000
...,...,...,...,...,...,...,...
18635,YRS000ETT23000002,[T03/2023] Provision of event management servi...,Yellow Ribbon Singapore,2023-06-09,Awarded to Suppliers,ADVENTURER'S SINGAPORE PTE LTD,$912605.000
18636,YRS000ETT23000003,[T02/2023] For the set-up of logistics facilit...,Yellow Ribbon Singapore,2023-08-15,Awarded by Items,FIRECONTROL TECH PTE. LTD.,$72400.000
18637,YRS000ETT23000003,[T02/2023] For the set-up of logistics facilit...,Yellow Ribbon Singapore,2023-08-15,Awarded by Items,KUNYI (SINGAPORE) PTE LTD,$648000.000
18638,YRS000ETT23000004,[T04/2023] Provision of legal services for gen...,Yellow Ribbon Singapore,2023-09-19,Awarded to Suppliers,CENTRAL CHAMBERS LAW CORPORATION,$184800.000


In [52]:
df_cleaned.isnull().any()

tender_no               False
tender_description      False
agency                  False
award_date              False
tender_detail_status    False
supplier_name           False
awarded_amt             False
dtype: bool

In [53]:
df_cleaned[df_cleaned.awarded_amt >= 1000000]

Unnamed: 0,tender_no,tender_description,agency,award_date,tender_detail_status,supplier_name,awarded_amt
0,ACR000ETT18300010,"SUPPLY, DESIGN, DEVELOPMENT, CUSTOMIZATION, DE...",Accounting And Corporate Regulatory Authority,2019-06-11,Awarded to Suppliers,AZAAS PTE. LTD.,$2305880.000
2,ACR000ETT19300001,PROVISION OF CONSULTANCY SERVICES FOR STRATEGI...,Accounting And Corporate Regulatory Authority,2019-04-30,Awarded to Suppliers,ACCENTURE SG SERVICES PTE. LTD.,$2035000.000
3,ACR000ETT19300002,"SUPPLY, DELIVERY, DESIGN, CUSTOMISATION, INSTA...",Accounting And Corporate Regulatory Authority,2019-08-29,Awarded to Suppliers,TECH MAHINDRA LIMITED (SINGAPORE BRANCH),$30700373.870
5,ACR000ETT19300004,INVITATION TO TENDER FOR THE PROVISION OF CALL...,Accounting And Corporate Regulatory Authority,2019-11-05,Awarded to Suppliers,TDCX (SG) PTE. LTD.,$8566830.840
9,ACR000ETT20300004,"CONCEPTUALIZATION, DESIGN, BUILD, SET-UP OF NE...",Accounting And Corporate Regulatory Authority,2021-03-09,Awarded to Suppliers,D' PERCEPTION SINGAPORE PTE. LTD.,$3071056.400
...,...,...,...,...,...,...,...
18615,WSG000ETT23000003,Consultancy Services to Perform an Internation...,Workforce Singapore,2023-07-25,Awarded to Suppliers,"MCKINSEY & COMPANY SINGAPORE, PTE LTD",$1763000.000
18626,YRS000ETT21000006,Provision of training for Logistics and Produc...,Yellow Ribbon Singapore,2022-04-05,Awarded to Suppliers,SSA ACADEMY PTE. LTD.,$3086640.000
18628,YRS000ETT22000001,"Provision of training in Employability Skills,...",Yellow Ribbon Singapore,2022-07-04,Awarded by Items,SSA ACADEMY PTE. LTD.,$1169150.000
18631,YRS000ETT22000005,Provision of Application Project Management an...,Yellow Ribbon Singapore,2023-03-15,Awarded to Suppliers,TRIMINDTECH SOLUTIONS PTE. LTD.,$1111000.000


### Using nlargest or nsmallest

If we are not filtering based on a threshold, we can use ```nlargest``` or ```nsmallest``` to view the n rows with the largest or smallest values in a column.

In [57]:
df_cleaned.nlargest(5, "awarded_amt")

Unnamed: 0,tender_no,tender_description,agency,award_date,tender_detail_status,supplier_name,awarded_amt
11016,NEA000ETT19300029,Proposed Erection of Integrated Waste Manageme...,National Environment Agency,2020-04-14,Awarded to Suppliers,KEPPEL SEGHERS ENGINEERING SINGAPORE PTE. LTD.,$1493179167.000
8397,LTA000ETT21000084,Design and Construction of Riviera Interchange...,Land Transport Authority,2022-09-30,Awarded to Suppliers,TAISEI CORPORATION,$1098195596.000
8078,LTA000ETT19300138,Design and Construction of Changi East Depot,Land Transport Authority,2021-05-28,Awarded to Suppliers,CHINA JINGYE ENGINEERING CORPORATION LIMITED (...,$1050500000.000
8140,LTA000ETT19300194,Bus Contracting - Bulim and Sembawang-Yishun B...,Land Transport Authority,2020-09-30,Awarded to Suppliers,TOWER TRANSIT SINGAPORE PTE. LTD.,$1025102939.000
11415,NEA000ETT23000064,Tender for Integrated Public Cleaning for the ...,National Environment Agency,2024-01-11,Awarded to Suppliers,CHYE THIAM MAINTENANCE PTE LTD,$999999999.000


In [63]:
df_cleaned.nsmallest(5, "awarded_amt")

Unnamed: 0,tender_no,tender_description,agency,award_date,tender_detail_status,supplier_name,awarded_amt
1,ACR000ETT18300011,"APPLICATION ENHANCEMENT, CUSTOMISATION, MIGRAT...",Accounting And Corporate Regulatory Authority,2019-05-10,Awarded to No Suppliers,Unknown,$0.000
67,AGC000ETT23000006,Provision of Agile Application Development and...,Attorney-General's Chambers,2023-12-19,Awarded to No Suppliers,Unknown,$0.000
75,AGO000ETT22000004,TENDER FOR ENGAGEMENT OF AUDIT SERVICES TO CAR...,Auditor-General's Office,2022-07-12,Awarded to No Suppliers,Unknown,$0.000
111,BCA000ETT20300004,PROPOSED ESTATE UPGRADING PROGRAMME (EUP) BATC...,Building and Construction Authority,2020-06-19,Awarded to No Suppliers,Unknown,$0.000
112,BCA000ETT20300005,PROPOSED ESTATE UPGRADING PROGRAMME (EUP) BATC...,Building and Construction Authority,2020-06-19,Awarded to No Suppliers,Unknown,$0.000


### Filtering by dates

#### Finding tenders between 2021 and 2022

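```datetime``` is part of Python's standard library, while ```Timestamp``` is the Pandas equivalent, built on top of NumPy's datetime64 type. The two are interchangeable in most cases, and ```Timestamp``` is the type Pandas uses for individual values of a datetime column, so it compares directly against ```award_date```.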

In [54]:
start_date = pd.Timestamp('2021-01-01')
end_date = pd.Timestamp('2022-12-31')

df_filtered = df_cleaned[(df_cleaned['award_date'] >= start_date) & (df_cleaned['award_date']<= end_date)]
df_filtered

Unnamed: 0,tender_no,tender_description,agency,award_date,tender_detail_status,supplier_name,awarded_amt
9,ACR000ETT20300004,"CONCEPTUALIZATION, DESIGN, BUILD, SET-UP OF NE...",Accounting And Corporate Regulatory Authority,2021-03-09,Awarded to Suppliers,D' PERCEPTION SINGAPORE PTE. LTD.,$3071056.400
10,ACR000ETT21000001,"DESIGN, DEVELOPMENT, CUSTOMIZATION, DELIVERY, ...",Accounting And Corporate Regulatory Authority,2021-09-06,Awarded to Suppliers,ALPHA ZETTA PTE. LTD.,$2321600.000
11,ACR000ETT21000003,"SUPPLY, DELIVERY, INSTALLATION, TESTING, COMMI...",Accounting And Corporate Regulatory Authority,2022-04-14,Awarded to Suppliers,ACCENTURE SG SERVICES PTE. LTD.,$108555723.000
13,ACR000ETT21000004,INVITATION TO TENDER FOR THE APPLICATION SOFTW...,Accounting And Corporate Regulatory Authority,2022-04-13,Awarded to Suppliers,CRIMSONLOGIC PTE LTD,$4503450.000
14,ACR000ETT22000001,FOR PROVISION OF THE APPLICATION SOFTWARE MAIN...,Accounting And Corporate Regulatory Authority,2022-06-28,Awarded to Suppliers,FPT ASIA PACIFIC PTE. LTD.,$539600.000
...,...,...,...,...,...,...,...
18628,YRS000ETT22000001,"Provision of training in Employability Skills,...",Yellow Ribbon Singapore,2022-07-04,Awarded by Items,SSA ACADEMY PTE. LTD.,$1169150.000
18629,YRS000ETT22000003,Provision of training in WSQ Operate Forklift ...,Yellow Ribbon Singapore,2022-08-25,Awarded to No Suppliers,Unknown,$0.000
18630,YRS000ETT22000004,"Provision of event management services, suppor...",Yellow Ribbon Singapore,2022-06-24,Awarded to Suppliers,ADVENTURER'S SINGAPORE PTE LTD,$546855.000
18632,YRS000ETT22000006,Provision of training for barista and food ser...,Yellow Ribbon Singapore,2022-12-08,Awarded by Items,BETTR BARISTA PTE. LTD.,$286000.000


### Filtering by column names

The ```filter``` method selects columns (or rows) by their labels, not by their contents. It is useful for getting a smaller dataset from a larger one based on the questions asked.

For example, you can use ```filter``` to get a one-column dataframe:

In [29]:
df_cleaned.filter(["supplier_name"])

Unnamed: 0,supplier_name
0,AZAAS PTE. LTD.
1,Unknown
2,ACCENTURE SG SERVICES PTE. LTD.
3,TECH MAHINDRA LIMITED (SINGAPORE BRANCH)
4,INSIGHTMATRIX
...,...
18635,ADVENTURER'S SINGAPORE PTE LTD
18636,FIRECONTROL TECH PTE. LTD.
18637,KUNYI (SINGAPORE) PTE LTD
18638,CENTRAL CHAMBERS LAW CORPORATION


#### Filtering multiple columns in a preferred order

In addition, you can use filter to get a set of columns in a particular sequence within the dataframe.

In [60]:
df_sm = df_cleaned.filter(["award_date", "supplier_name","awarded_amt"])
df_sm

Unnamed: 0,award_date,supplier_name,awarded_amt
0,2019-06-11,AZAAS PTE. LTD.,$2305880.000
1,2019-05-10,Unknown,$0.000
2,2019-04-30,ACCENTURE SG SERVICES PTE. LTD.,$2035000.000
3,2019-08-29,TECH MAHINDRA LIMITED (SINGAPORE BRANCH),$30700373.870
4,2019-08-06,INSIGHTMATRIX,$178800.000
...,...,...,...
18635,2023-06-09,ADVENTURER'S SINGAPORE PTE LTD,$912605.000
18636,2023-08-15,FIRECONTROL TECH PTE. LTD.,$72400.000
18637,2023-08-15,KUNYI (SINGAPORE) PTE LTD,$648000.000
18638,2023-09-19,CENTRAL CHAMBERS LAW CORPORATION,$184800.000


### Filtering columns by strings (like)

Using ```like='award'```, we keep the columns whose names contain the substring "award". The match is a plain substring test, so it is case-sensitive.


In [62]:
award_columns = df_sm.filter(like='award')
award_columns

Unnamed: 0,award_date,awarded_amt
0,2019-06-11,$2305880.000
1,2019-05-10,$0.000
2,2019-04-30,$2035000.000
3,2019-08-29,$30700373.870
4,2019-08-06,$178800.000
...,...,...
18635,2023-06-09,$912605.000
18636,2023-08-15,$72400.000
18637,2023-08-15,$648000.000
18638,2023-09-19,$184800.000
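
A quick check on an empty, made-up frame shows how ```like``` treats letter case: it is a plain substring test on the column labels. The column list below is hypothetical.

```python
import pandas as pd

# Empty frame; only the column labels matter for filter()
toy = pd.DataFrame(columns=["award_date", "awarded_amt", "Award date column", "supplier_name"])

lower = toy.filter(like="award").columns.tolist()
upper = toy.filter(like="Award").columns.tolist()

print(lower)  # labels containing "award"
print(upper)  # labels containing "Award"
```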


### Filtering columns by strings (regex)

Using ```regex='Award'```, we keep the columns whose names match a regular expression. The match is case-sensitive, so only names containing "Award" with a capital A are kept.

To prepare for the demonstration, we add in a column named "Award date column".

In [67]:
award_columns['Award date column'] = award_columns['award_date']
award_columns

Unnamed: 0,award_date,awarded_amt,Award date column
0,2019-06-11,$2305880.000,2019-06-11
1,2019-05-10,$0.000,2019-05-10
2,2019-04-30,$2035000.000,2019-04-30
3,2019-08-29,$30700373.870,2019-08-29
4,2019-08-06,$178800.000,2019-08-06
...,...,...,...
18635,2023-06-09,$912605.000,2023-06-09
18636,2023-08-15,$72400.000,2023-08-15
18637,2023-08-15,$648000.000,2023-08-15
18638,2023-09-19,$184800.000,2023-09-19


In [70]:
award_columns.filter(regex='Award')

Unnamed: 0,Award date column
0,2019-06-11
1,2019-05-10
2,2019-04-30
3,2019-08-29
4,2019-08-06
...,...
18635,2023-06-09
18636,2023-08-15
18637,2023-08-15
18638,2023-09-19
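
Since the pattern is an ordinary Python regular expression, it can carry anchors and inline flags. The sketch below, on a hypothetical set of column labels, uses ```(?i)``` to ignore case and ```^``` to match only at the start of a label.

```python
import pandas as pd

# Empty frame; only the column labels matter for filter()
toy = pd.DataFrame(columns=["award_date", "awarded_amt", "Award date column", "supplier_name"])

# (?i) is the inline flag from Python's re module that ignores letter case
any_case = toy.filter(regex="(?i)award").columns.tolist()

# ^ anchors the pattern at the start of the label
starts_lower = toy.filter(regex="^award").columns.tolist()

print(any_case)
print(starts_lower)
```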


### Filtering rows based on a given list of values

We can keep only the rows whose value appears in a given list using ```isin```.

In [74]:
shortlist = ['ACCENTURE SG SERVICES PTE. LTD.','ASSURANCE PARTNERS LLP']
df_filtered[df_filtered.supplier_name.isin(shortlist)]

Unnamed: 0,tender_no,tender_description,agency,award_date,tender_detail_status,supplier_name,awarded_amt
11,ACR000ETT21000003,"SUPPLY, DELIVERY, INSTALLATION, TESTING, COMMI...",Accounting And Corporate Regulatory Authority,2022-04-14,Awarded to Suppliers,ACCENTURE SG SERVICES PTE. LTD.,$108555723.000
392,CCS000ETT22000002,"Development, and delivery of a fully operation...",Competition and Consumer Commission of Singapo...,2022-11-03,Awarded to Suppliers,ACCENTURE SG SERVICES PTE. LTD.,$451030.000
437,CCY000ETT21000005,PROVISION FOR A TWO-YEAR FRAMEWORK AGREEMENT (...,"Ministry of Culture, Community and Youth - Min...",2021-06-05,Awarded by Items,ASSURANCE PARTNERS LLP,$1127100.000
707,CDVHQ0ETT22000001,Invitation to Tender for the Provision of Audi...,Ministry of Social and Family Development - Mi...,2022-04-07,Awarded by Items,ASSURANCE PARTNERS LLP,$1273752.000
917,CPF000ETT21000010,Supply of IT Services for System Implementatio...,Central Provident Fund Board,2021-08-03,Awarded to Suppliers,ACCENTURE SG SERVICES PTE. LTD.,$23260.000
969,CPF000ETT22000012,Implementation and Maintenance of Unified Data...,Central Provident Fund Board,2022-11-30,Awarded to Suppliers,ACCENTURE SG SERVICES PTE. LTD.,$11022100.000
1973,FINAGDETT20300002,Invitation to Tender for Application Maintenan...,Ministry of Finance-Accountant-General's Depar...,2021-07-14,Awarded by Items,ACCENTURE SG SERVICES PTE. LTD.,$5153898.010
3255,GVT000ETT20300035,FOR THE PROVISION OF DATA SCIENCE SOFTWARE AND...,Government Technology Agency (GovTech),2021-04-30,Awarded by Items,ACCENTURE SG SERVICES PTE. LTD.,$1.000
3265,GVT000ETT20300037,FOR THE PROVISION OF THE IMPLEMENTATION AND OP...,Government Technology Agency (GovTech),2021-01-12,Awarded to Suppliers,ACCENTURE SG SERVICES PTE. LTD.,$28569312.000
3324,GVT000ETT21000025,SUPPLEMENTARY INVITATION TO TENDER FOR AGILE C...,Government Technology Agency (GovTech),2022-03-16,Awarded by Items,ACCENTURE SG SERVICES PTE. LTD.,$8738880.000
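
The boolean mask returned by ```isin``` can also be inverted with ```~``` to keep everything that is not on the list. A minimal sketch, using a made-up three-row frame with supplier names taken from the tables above:

```python
import pandas as pd

# Hypothetical three-row frame
toy = pd.DataFrame({
    "supplier_name": ["ACCENTURE SG SERVICES PTE. LTD.", "Unknown", "INSIGHTMATRIX"],
    "awarded_amt": [2035000.0, 0.0, 178800.0],
})

shortlist = ["ACCENTURE SG SERVICES PTE. LTD.", "ASSURANCE PARTNERS LLP"]
on_list = toy["supplier_name"].isin(shortlist)

print(toy[on_list])    # rows whose supplier is on the shortlist
print(toy[~on_list])   # ~ inverts the mask: every other row
```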


### Filtering rows based on string values within rows

We can use the ```str``` accessor to filter rows, for example, rows whose values start with "STB" or contain a given substring.

In [77]:
df_filtered[df_filtered["tender_no"].str.startswith("STB")]

Unnamed: 0,tender_no,tender_description,agency,award_date,tender_detail_status,supplier_name,awarded_amt
17846,STB000ETT20300039,Tender to Appoint a Research Consultant for Pa...,Singapore Tourism Board,2021-03-18,Awarded to Suppliers,"Grail Insights, Inc",$270100.000
17848,STB000ETT20300041,PUBLIC RELATIONS AND DIGITAL MARKETING CONSULT...,Singapore Tourism Board,2021-02-02,Awarded to Suppliers,Ruder Finn (Shanghai) Marketing & Communicatio...,$2780887.960
17849,STB000ETT20300042,TENDER FOR THE PROVISION OF GUARANTEED ENERGY ...,Singapore Tourism Board,2021-03-03,Awarded to Suppliers,AIRELATED SERVICES PTE. LTD.,$1688333.000
17850,STB000ETT20300043,Request for Proposal for Induced Expenditure S...,Singapore Tourism Board,2021-01-13,Awarded to Suppliers,STARHUB LTD.,$215000.000
17851,STB000ETT20300044,Tender for Warehousing and Freight Management ...,Singapore Tourism Board,2021-04-08,Awarded to Suppliers,ST ENGINEERING SYNTHESIS PTE. LTD.,$399093.000
...,...,...,...,...,...,...,...
17971,STB000ETT22000032,TENDER FOR PROVISION OF INTEGRATED FACILITIES ...,Singapore Tourism Board,2022-12-20,Awarded to Suppliers,EXCELTEC PROPERTY MANAGEMENT PTE LTD,$13529260.800
17972,STB000ETT22000033,Tender for Leasing of Grid-Tied Solar Photovol...,Singapore Tourism Board,2022-11-17,Awarded to Suppliers,ENGIE SOUTH EAST ASIA PTE. LTD.,$700943.160
17973,STB000ETT22000034,Tender for Provision of Comprehensive Access f...,Singapore Tourism Board,2022-10-11,Awarded to Suppliers,INTERNATIONAL SOS PTE LTD,$89880.000
17974,STB000ETT22000035,CONSULTANCY SERVICES TO PROJECT MANAGE STB'S A...,Singapore Tourism Board,2022-12-15,Awarded to Suppliers,TURNER & TOWNSEND PTE. LIMITED,$968000.000


In [78]:
df_filtered[df_filtered["tender_description"].str.contains("food")]

Unnamed: 0,tender_no,tender_description,agency,award_date,tender_detail_status,supplier_name,awarded_amt
11271,NEA000ETT21000096,Supply and delivery of fish food,National Environment Agency,2022-07-18,Awarded by Items,ALTECH INTEGRATED PTE. LTD.,$842580.000
11272,NEA000ETT21000096,Supply and delivery of fish food,National Environment Agency,2022-07-18,Awarded by Items,REIN BIOTECH SERVICES PTE LTD,$28910.000
16621,SFA000ETT20300045,Tender for provision of food and beverage samp...,Singapore Food Agency,2021-03-02,Awarded to Suppliers,SETSCO SERVICES PTE LTD,$334800.000
16645,SFA000ETT21000017,Tender for Four-year period contract for the s...,Singapore Food Agency,2021-11-23,Awarded by Items,AGILENT TECHNOLOGIES SINGAPORE (SALES) PTE. LTD.,$109846.800
16646,SFA000ETT21000017,Tender for Four-year period contract for the s...,Singapore Food Agency,2021-11-23,Awarded by Items,CHARSLTON TECHNOLOGIES PTE LTD,$9882.200
16647,SFA000ETT21000017,Tender for Four-year period contract for the s...,Singapore Food Agency,2021-11-23,Awarded by Items,FISHER SCIENTIFIC PTE LTD,$20451.750
16648,SFA000ETT21000017,Tender for Four-year period contract for the s...,Singapore Food Agency,2021-11-23,Awarded by Items,Phenomenex INC,$50189.300
16649,SFA000ETT21000017,Tender for Four-year period contract for the s...,Singapore Food Agency,2021-11-23,Awarded by Items,WATERS PACIFIC PTE. LTD.,$29463.530
16691,SFA000ETT21000050,Tender for the Study of food waste from local ...,Singapore Food Agency,2022-06-06,Awarded to Suppliers,Metabolic BV,$520330.000
16705,SFA000ETT22000002,Tender for enforcement services to carry out s...,Singapore Food Agency,2022-06-28,Awarded to Suppliers,CERTIS CISCO AUXILIARY POLICE FORCE PTE. LTD.,$22661424.000


## Practice: How can we filter rows that contain both the words "application" and "software" in the tender description?

In [82]:
df_filtered[df_filtered["tender_description"].str.contains("application") & df_filtered["tender_description"].str.contains("software") ]

Unnamed: 0,tender_no,tender_description,agency,award_date,tender_detail_status,supplier_name,awarded_amt
5418,HPB000ETT21000006,CIOOT33/20 - Provision of application software...,Health Promotion Board,2021-06-02,Awarded by Items,EY DIGITAL PTE. LTD.,$959298.000
5419,HPB000ETT21000006,CIOOT33/20 - Provision of application software...,Health Promotion Board,2021-06-02,Awarded by Items,TRIBAL WORLDWIDE PTE. LTD.,$494730.000
13111,NST000ETT21000142,Supply of data visualisation software for deve...,"Agency for Science, Technology and Research",2021-12-30,Awarded to Suppliers,Plotly Technologies Inc.,$229210.550
16018,RPO000ETT21000006,Provision of the application software maintena...,Republic Polytechnic,2021-07-01,Awarded to Suppliers,AVEPOINT SINGAPORE PTE. LTD.,$4902490.200


### Make the filtering of rows based on string values case-insensitive

In [86]:
df_filtered[df_filtered["tender_description"].str.startswith("t")]

Unnamed: 0,tender_no,tender_description,agency,award_date,tender_detail_status,supplier_name,awarded_amt


In [88]:
df_filtered[df_filtered["tender_description"].str.match(r'^t', case=False)]

Unnamed: 0,tender_no,tender_description,agency,award_date,tender_detail_status,supplier_name,awarded_amt
74,AGO000ETT22000002,To Procure Adobe Acrobat Pro (AAP) Software Li...,Auditor-General's Office,2022-06-06,Awarded to Suppliers,JK TECHNOLOGY PTE LTD,$295698.000
75,AGO000ETT22000004,TENDER FOR ENGAGEMENT OF AUDIT SERVICES TO CAR...,Auditor-General's Office,2022-07-12,Awarded to No Suppliers,Unknown,$0.000
149,BCA000ETT21000002,TERM CONTRACT FOR THE MAINTENANCE OF MECHANICA...,Building and Construction Authority,2021-05-12,Awarded to Suppliers,SAVILLS PROPERTY MANAGEMENT PTE. LTD.,$2161002.000
155,BCA000ETT21000009,TENDER FOR THE PROVISION OF CONSULTANCY SERVIC...,Building and Construction Authority,2021-08-06,Awarded to Suppliers,ERNST & YOUNG ADVISORY PTE. LTD.,$888000.000
156,BCA000ETT21000010,TERM CONTRACT FOR EVALUATION OF ESSENTIAL CONS...,Building and Construction Authority,2021-06-07,Awarded by Items,ELEMENT CONSTRUCTION TESTING (S) PTE. LTD.,$600.000
...,...,...,...,...,...,...,...
18530,URA000ETT22000003,Tender For Appointment Of Legal Panel For Prov...,Urban Redevelopment Authority,2022-05-06,Awarded by Items,LEE & LEE,$3800.000
18531,URA000ETT22000003,Tender For Appointment Of Legal Panel For Prov...,Urban Redevelopment Authority,2022-05-06,Awarded by Items,WONGPARTNERSHIP LLP,$4170.000
18532,URA000ETT22000004,Term Contract For The Provision Of Landscape M...,Urban Redevelopment Authority,2022-04-29,Awarded to Suppliers,PRINCE'S LANDSCAPE PTE. LTD.,$216000.000
18534,URA000ETT22000009,Tender For Renewal Of Existing Autodesk Licenses,Urban Redevelopment Authority,2022-04-11,Awarded to Suppliers,INNOCOM TECHNOLOGIES PTE LTD,$223600.000
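
The `case=False` flag shown above is also accepted by `str.contains`, so keyword filters can be made case-insensitive in the same way. The sketch below uses a small made-up Series (`desc` is a hypothetical stand-in for `df_filtered["tender_description"]`) and also shows an alternative prefix test that lower-cases the text first.

```python
import pandas as pd

# Hypothetical toy data standing in for df_filtered["tender_description"]
desc = pd.Series([
    "Provision of application software maintenance",
    "INVITATION TO TENDER FOR THE APPLICATION SOFTWARE",
    "Supply of data visualisation software",
    "Tender for enforcement services",
])

# Case-sensitive: only the lower-case description matches both words
sensitive = desc.str.contains("application") & desc.str.contains("software")

# Case-insensitive: case=False also matches the upper-case description
insensitive = (
    desc.str.contains("application", case=False)
    & desc.str.contains("software", case=False)
)

# Prefix test without a regex: normalise the case, then use startswith
starts_with_t = desc.str.lower().str.startswith("t")

print(sensitive.tolist())
print(insensitive.tolist())
print(starts_with_t.tolist())
```

On the real dataset, the case-insensitive version would be expected to return more rows than the case-sensitive practice answer, because many descriptions are written in capitals.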


### Self-exploration

What does the tilde (```~```) in the following code do?

In [90]:
df_filtered[~df_filtered["tender_description"].str.contains("application", case=False)]

Unnamed: 0,tender_no,tender_description,agency,award_date,tender_detail_status,supplier_name,awarded_amt
9,ACR000ETT20300004,"CONCEPTUALIZATION, DESIGN, BUILD, SET-UP OF NE...",Accounting And Corporate Regulatory Authority,2021-03-09,Awarded to Suppliers,D' PERCEPTION SINGAPORE PTE. LTD.,$3071056.400
10,ACR000ETT21000001,"DESIGN, DEVELOPMENT, CUSTOMIZATION, DELIVERY, ...",Accounting And Corporate Regulatory Authority,2021-09-06,Awarded to Suppliers,ALPHA ZETTA PTE. LTD.,$2321600.000
11,ACR000ETT21000003,"SUPPLY, DELIVERY, INSTALLATION, TESTING, COMMI...",Accounting And Corporate Regulatory Authority,2022-04-14,Awarded to Suppliers,ACCENTURE SG SERVICES PTE. LTD.,$108555723.000
38,AGC000ETT21000002,INVITATION TO TENDER FOR IT CHANGE MANAGEMENT ...,Attorney-General's Chambers,2021-07-14,Awarded by Items,WSH EXPERTS PTE. LTD.,$589920.000
39,AGC000ETT21000002,INVITATION TO TENDER FOR IT CHANGE MANAGEMENT ...,Attorney-General's Chambers,2021-07-14,Awarded by Items,ZENITH INFOTECH (S) PTE LTD.,$2078040.000
...,...,...,...,...,...,...,...
18628,YRS000ETT22000001,"Provision of training in Employability Skills,...",Yellow Ribbon Singapore,2022-07-04,Awarded by Items,SSA ACADEMY PTE. LTD.,$1169150.000
18629,YRS000ETT22000003,Provision of training in WSQ Operate Forklift ...,Yellow Ribbon Singapore,2022-08-25,Awarded to No Suppliers,Unknown,$0.000
18630,YRS000ETT22000004,"Provision of event management services, suppor...",Yellow Ribbon Singapore,2022-06-24,Awarded to Suppliers,ADVENTURER'S SINGAPORE PTE LTD,$546855.000
18632,YRS000ETT22000006,Provision of training for barista and food ser...,Yellow Ribbon Singapore,2022-12-08,Awarded by Items,BETTR BARISTA PTE. LTD.,$286000.000
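
As a hint for the question above, the sketch below applies `~` to a boolean mask on a tiny made-up DataFrame (`df_toy` and its rows are hypothetical, not taken from the dataset). It shows that `~` flips every `True` to `False` and vice versa, so the filter keeps the rows that do not match.

```python
import pandas as pd

# Hypothetical toy data; only the column name mirrors the real dataset
df_toy = pd.DataFrame({
    "tender_description": [
        "Application software maintenance",
        "Landscape maintenance",
        "Provision of APPLICATION hosting",
    ]
})

# True where the description mentions "application" (any case)
mask = df_toy["tender_description"].str.contains("application", case=False)

# ~mask inverts the mask element by element
kept = df_toy[~mask]

print(mask.tolist())
print((~mask).tolist())
print(kept)
```

If the column may hold missing values, passing `na=False` to `str.contains` keeps the mask strictly boolean, which makes it safe to invert.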


## Query

The ```query()``` method is useful for expressing filters that use comparison operators such as "equal to" and "less than". It takes the filter conditions as a single string.

In [94]:
df_filtered.query('supplier_name == "SSA ACADEMY PTE. LTD." and awarded_amt  >= 1000000')


Unnamed: 0,tender_no,tender_description,agency,award_date,tender_detail_status,supplier_name,awarded_amt
18626,YRS000ETT21000006,Provision of training for Logistics and Produc...,Yellow Ribbon Singapore,2022-04-05,Awarded to Suppliers,SSA ACADEMY PTE. LTD.,$3086640.000
18628,YRS000ETT22000001,"Provision of training in Employability Skills,...",Yellow Ribbon Singapore,2022-07-04,Awarded by Items,SSA ACADEMY PTE. LTD.,$1169150.000
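
To make the link with boolean indexing explicit, the sketch below runs the same kind of filter both ways on a small made-up DataFrame. The names `df_toy` and `threshold` are assumptions for illustration; the `@` prefix is how `query()` refers to a Python variable from the surrounding code.

```python
import pandas as pd

# Hypothetical toy data with the same column names as the dataset
df_toy = pd.DataFrame({
    "supplier_name": [
        "SSA ACADEMY PTE. LTD.",
        "SSA ACADEMY PTE. LTD.",
        "BETTR BARISTA PTE. LTD.",
    ],
    "awarded_amt": [3086640.0, 72000.0, 286000.0],
})

threshold = 1_000_000  # hypothetical cut-off, referenced as @threshold below

# Conditions written as a string
by_query = df_toy.query(
    'supplier_name == "SSA ACADEMY PTE. LTD." and awarded_amt >= @threshold'
)

# The same conditions written as a boolean mask
by_mask = df_toy[
    (df_toy["supplier_name"] == "SSA ACADEMY PTE. LTD.")
    & (df_toy["awarded_amt"] >= threshold)
]

print(by_query)
print(by_query.equals(by_mask))
```

Both forms select the same rows; `query()` is mainly a readability choice when several conditions are combined.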


## Practice: How can we query for tenders awarded by "Attorney-General's Chambers" in 2021?

In [95]:
df_filtered.query('agency == "Attorney-General\'s Chambers" and award_date.dt.year  == 2021')

Unnamed: 0,tender_no,tender_description,agency,award_date,tender_detail_status,supplier_name,awarded_amt
38,AGC000ETT21000002,INVITATION TO TENDER FOR IT CHANGE MANAGEMENT ...,Attorney-General's Chambers,2021-07-14,Awarded by Items,WSH EXPERTS PTE. LTD.,$589920.000
39,AGC000ETT21000002,INVITATION TO TENDER FOR IT CHANGE MANAGEMENT ...,Attorney-General's Chambers,2021-07-14,Awarded by Items,ZENITH INFOTECH (S) PTE LTD.,$2078040.000
40,AGC000ETT21000003,"INVITATION TO TENDER FOR DESIGN, SUPPLY, DELIV...",Attorney-General's Chambers,2021-11-01,Awarded to Suppliers,NEC ASIA PACIFIC PTE. LTD.,$5020611.000
41,AGC000ETT21000004,"INVITATION TO TENDER FOR THE DEVELOPMENT, MAIN...",Attorney-General's Chambers,2021-11-28,Awarded to Suppliers,NEC ASIA PACIFIC PTE. LTD.,$2616961.000
42,AGC000ETT21000006,"INVITATION TO TENDER FOR THE SUPPLY, DELIVERY,...",Attorney-General's Chambers,2021-11-24,Awarded to Suppliers,JK TECHNOLOGY PTE LTD,$370440.000
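
The `.dt.year` accessor used in the query above is only available when `award_date` has a datetime dtype. The sketch below, on made-up rows (`df_toy` and its dates are hypothetical), converts the column with `pd.to_datetime` and then applies the same filter as a boolean mask.

```python
import pandas as pd

# Hypothetical toy data; dates start out as plain strings
df_toy = pd.DataFrame({
    "agency": [
        "Attorney-General's Chambers",
        "Attorney-General's Chambers",
        "Singapore Food Agency",
    ],
    "award_date": ["2021-07-14", "2022-03-01", "2021-11-23"],
})

# Convert to datetime so that the .dt accessor can be used
df_toy["award_date"] = pd.to_datetime(df_toy["award_date"])

# Same conditions as the query, written as a boolean mask
agc_2021 = df_toy[
    (df_toy["agency"] == "Attorney-General's Chambers")
    & (df_toy["award_date"].dt.year == 2021)
]

print(agc_2021)
```

The mask form also avoids having to escape the apostrophe in the agency name inside a query string.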


## Practice:

Now that we are familiar with how to find rows with a specific value, try to find the rows where the supplier name is "Unknown" and remove them.

In [116]:
df_filtered

Unnamed: 0,tender_no,tender_description,agency,award_date,tender_detail_status,supplier_name,awarded_amt
9,ACR000ETT20300004,"CONCEPTUALIZATION, DESIGN, BUILD, SET-UP OF NE...",Accounting And Corporate Regulatory Authority,2021-03-09,Awarded to Suppliers,D' PERCEPTION SINGAPORE PTE. LTD.,$3071056.400
10,ACR000ETT21000001,"DESIGN, DEVELOPMENT, CUSTOMIZATION, DELIVERY, ...",Accounting And Corporate Regulatory Authority,2021-09-06,Awarded to Suppliers,ALPHA ZETTA PTE. LTD.,$2321600.000
11,ACR000ETT21000003,"SUPPLY, DELIVERY, INSTALLATION, TESTING, COMMI...",Accounting And Corporate Regulatory Authority,2022-04-14,Awarded to Suppliers,ACCENTURE SG SERVICES PTE. LTD.,$108555723.000
13,ACR000ETT21000004,INVITATION TO TENDER FOR THE APPLICATION SOFTW...,Accounting And Corporate Regulatory Authority,2022-04-13,Awarded to Suppliers,CRIMSONLOGIC PTE LTD,$4503450.000
14,ACR000ETT22000001,FOR PROVISION OF THE APPLICATION SOFTWARE MAIN...,Accounting And Corporate Regulatory Authority,2022-06-28,Awarded to Suppliers,FPT ASIA PACIFIC PTE. LTD.,$539600.000
...,...,...,...,...,...,...,...
18628,YRS000ETT22000001,"Provision of training in Employability Skills,...",Yellow Ribbon Singapore,2022-07-04,Awarded by Items,SSA ACADEMY PTE. LTD.,$1169150.000
18629,YRS000ETT22000003,Provision of training in WSQ Operate Forklift ...,Yellow Ribbon Singapore,2022-08-25,Awarded to No Suppliers,Unknown,$0.000
18630,YRS000ETT22000004,"Provision of event management services, suppor...",Yellow Ribbon Singapore,2022-06-24,Awarded to Suppliers,ADVENTURER'S SINGAPORE PTE LTD,$546855.000
18632,YRS000ETT22000006,Provision of training for barista and food ser...,Yellow Ribbon Singapore,2022-12-08,Awarded by Items,BETTR BARISTA PTE. LTD.,$286000.000


In [125]:
df_ans = df_filtered[~df_filtered['supplier_name'].str.contains("Unknown", case=False)]
df_ans

Unnamed: 0,tender_no,tender_description,agency,award_date,tender_detail_status,supplier_name,awarded_amt
9,ACR000ETT20300004,"CONCEPTUALIZATION, DESIGN, BUILD, SET-UP OF NE...",Accounting And Corporate Regulatory Authority,2021-03-09,Awarded to Suppliers,D' PERCEPTION SINGAPORE PTE. LTD.,$3071056.400
10,ACR000ETT21000001,"DESIGN, DEVELOPMENT, CUSTOMIZATION, DELIVERY, ...",Accounting And Corporate Regulatory Authority,2021-09-06,Awarded to Suppliers,ALPHA ZETTA PTE. LTD.,$2321600.000
11,ACR000ETT21000003,"SUPPLY, DELIVERY, INSTALLATION, TESTING, COMMI...",Accounting And Corporate Regulatory Authority,2022-04-14,Awarded to Suppliers,ACCENTURE SG SERVICES PTE. LTD.,$108555723.000
13,ACR000ETT21000004,INVITATION TO TENDER FOR THE APPLICATION SOFTW...,Accounting And Corporate Regulatory Authority,2022-04-13,Awarded to Suppliers,CRIMSONLOGIC PTE LTD,$4503450.000
14,ACR000ETT22000001,FOR PROVISION OF THE APPLICATION SOFTWARE MAIN...,Accounting And Corporate Regulatory Authority,2022-06-28,Awarded to Suppliers,FPT ASIA PACIFIC PTE. LTD.,$539600.000
...,...,...,...,...,...,...,...
18627,YRS000ETT22000001,"Provision of training in Employability Skills,...",Yellow Ribbon Singapore,2022-07-04,Awarded by Items,NTUC LEARNINGHUB PTE. LTD.,$72000.000
18628,YRS000ETT22000001,"Provision of training in Employability Skills,...",Yellow Ribbon Singapore,2022-07-04,Awarded by Items,SSA ACADEMY PTE. LTD.,$1169150.000
18630,YRS000ETT22000004,"Provision of event management services, suppor...",Yellow Ribbon Singapore,2022-06-24,Awarded to Suppliers,ADVENTURER'S SINGAPORE PTE LTD,$546855.000
18632,YRS000ETT22000006,Provision of training for barista and food ser...,Yellow Ribbon Singapore,2022-12-08,Awarded by Items,BETTR BARISTA PTE. LTD.,$286000.000


We can also build a boolean mask of the rows we wish to remove and keep its complement with ```.loc```, or retrieve the index of those rows and pass it to the ```drop``` method.

In [126]:
# Build a boolean mask that flags the rows to remove
mask = (df_ans['supplier_name'] == "D' PERCEPTION SINGAPORE PTE. LTD.") | (df_ans['agency'] == 'Yellow Ribbon Singapore')

# Keep only the rows that are not flagged (the complement of the mask)
df_drop1 = df_ans.loc[~mask].copy()
df_drop1

Unnamed: 0,tender_no,tender_description,agency,award_date,tender_detail_status,supplier_name,awarded_amt
10,ACR000ETT21000001,"DESIGN, DEVELOPMENT, CUSTOMIZATION, DELIVERY, ...",Accounting And Corporate Regulatory Authority,2021-09-06,Awarded to Suppliers,ALPHA ZETTA PTE. LTD.,$2321600.000
11,ACR000ETT21000003,"SUPPLY, DELIVERY, INSTALLATION, TESTING, COMMI...",Accounting And Corporate Regulatory Authority,2022-04-14,Awarded to Suppliers,ACCENTURE SG SERVICES PTE. LTD.,$108555723.000
13,ACR000ETT21000004,INVITATION TO TENDER FOR THE APPLICATION SOFTW...,Accounting And Corporate Regulatory Authority,2022-04-13,Awarded to Suppliers,CRIMSONLOGIC PTE LTD,$4503450.000
14,ACR000ETT22000001,FOR PROVISION OF THE APPLICATION SOFTWARE MAIN...,Accounting And Corporate Regulatory Authority,2022-06-28,Awarded to Suppliers,FPT ASIA PACIFIC PTE. LTD.,$539600.000
38,AGC000ETT21000002,INVITATION TO TENDER FOR IT CHANGE MANAGEMENT ...,Attorney-General's Chambers,2021-07-14,Awarded by Items,WSH EXPERTS PTE. LTD.,$589920.000
...,...,...,...,...,...,...,...
18604,WSG000ETT22000006,Provision of digital services to support the d...,Workforce Singapore,2022-08-04,Awarded to Suppliers,SEEMECV PTE. LTD.,$692280.000
18605,WSG000ETT22000007,Integrated Services for Digital Network (ISDN)...,Workforce Singapore,2022-09-12,Awarded to Suppliers,SINGAPORE TELECOMMUNICATIONS LIMITED,$81840.040
18606,WSG000ETT22000008,Provision of Services to conduct mystery audit,Workforce Singapore,2022-08-08,Awarded to Suppliers,AADVANTAGE CONSULTING GROUP PTE. LTD.,$109600.000
18607,WSG000ETT22000009,INVITATION TO TENDER FOR THE PROVISION OF INTE...,Workforce Singapore,2022-10-25,Awarded to Suppliers,SEEMECV PTE. LTD.,$434920.000


In [127]:
rows_to_drop = df_ans[(df_ans['supplier_name'] == "D' PERCEPTION SINGAPORE PTE. LTD.") | (df_ans['agency'] == 'Yellow Ribbon Singapore')].index
df_drop2 = df_ans.drop(rows_to_drop)
df_drop2

Unnamed: 0,tender_no,tender_description,agency,award_date,tender_detail_status,supplier_name,awarded_amt
10,ACR000ETT21000001,"DESIGN, DEVELOPMENT, CUSTOMIZATION, DELIVERY, ...",Accounting And Corporate Regulatory Authority,2021-09-06,Awarded to Suppliers,ALPHA ZETTA PTE. LTD.,$2321600.000
11,ACR000ETT21000003,"SUPPLY, DELIVERY, INSTALLATION, TESTING, COMMI...",Accounting And Corporate Regulatory Authority,2022-04-14,Awarded to Suppliers,ACCENTURE SG SERVICES PTE. LTD.,$108555723.000
13,ACR000ETT21000004,INVITATION TO TENDER FOR THE APPLICATION SOFTW...,Accounting And Corporate Regulatory Authority,2022-04-13,Awarded to Suppliers,CRIMSONLOGIC PTE LTD,$4503450.000
14,ACR000ETT22000001,FOR PROVISION OF THE APPLICATION SOFTWARE MAIN...,Accounting And Corporate Regulatory Authority,2022-06-28,Awarded to Suppliers,FPT ASIA PACIFIC PTE. LTD.,$539600.000
38,AGC000ETT21000002,INVITATION TO TENDER FOR IT CHANGE MANAGEMENT ...,Attorney-General's Chambers,2021-07-14,Awarded by Items,WSH EXPERTS PTE. LTD.,$589920.000
...,...,...,...,...,...,...,...
18604,WSG000ETT22000006,Provision of digital services to support the d...,Workforce Singapore,2022-08-04,Awarded to Suppliers,SEEMECV PTE. LTD.,$692280.000
18605,WSG000ETT22000007,Integrated Services for Digital Network (ISDN)...,Workforce Singapore,2022-09-12,Awarded to Suppliers,SINGAPORE TELECOMMUNICATIONS LIMITED,$81840.040
18606,WSG000ETT22000008,Provision of Services to conduct mystery audit,Workforce Singapore,2022-08-08,Awarded to Suppliers,AADVANTAGE CONSULTING GROUP PTE. LTD.,$109600.000
18607,WSG000ETT22000009,INVITATION TO TENDER FOR THE PROVISION OF INTE...,Workforce Singapore,2022-10-25,Awarded to Suppliers,SEEMECV PTE. LTD.,$434920.000
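
The two outputs above look identical. The sketch below checks this on a small made-up DataFrame (all names and values in `df_toy` are hypothetical): inverting a mask with `.loc` and dropping the matching index labels with `drop` give the same result when the index labels are unique.

```python
import pandas as pd

# Hypothetical toy data with a non-contiguous index, as in df_ans
df_toy = pd.DataFrame(
    {
        "supplier_name": ["A PTE. LTD.", "B PTE. LTD.", "C PTE. LTD.", "D PTE. LTD."],
        "agency": ["Agency X", "Yellow Ribbon Singapore", "Agency X", "Agency Y"],
    },
    index=[9, 10, 11, 13],
)

# Flag the rows to remove
mask = (df_toy["supplier_name"] == "A PTE. LTD.") | (
    df_toy["agency"] == "Yellow Ribbon Singapore"
)

# Approach 1: keep the complement of the mask
kept_by_mask = df_toy.loc[~mask]

# Approach 2: look up the flagged index labels and drop them
kept_by_drop = df_toy.drop(df_toy[mask].index)

print(kept_by_mask.equals(kept_by_drop))
print(kept_by_drop.index.tolist())
```

Note that `drop` removes rows by label, so with duplicated index labels it would remove every row sharing a flagged label, whereas the mask acts row by row.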


## Groupby

```groupby``` is a split-apply-combine operation: it splits the DataFrame into groups, applies a function to each group, and combines the results. This helps to answer quantitative questions you may have about the dataset.
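
To illustrate the three steps, the sketch below spells out split, apply and combine by hand on a tiny made-up DataFrame and compares the result with a single `groupby` call. The data and the names `df_toy`, `manual` and `totals` are hypothetical.

```python
import pandas as pd

# Hypothetical toy data; column names mirror the dataset
df_toy = pd.DataFrame({
    "supplier_name": ["A", "B", "A", "C", "B"],
    "awarded_amt": [100.0, 50.0, 25.0, 10.0, 5.0],
})

# Split: iterate over one sub-DataFrame per supplier
# Apply: sum the awarded amounts in each sub-DataFrame
# Combine: collect the per-supplier results in one dictionary
manual = {}
for name, group in df_toy.groupby("supplier_name"):
    manual[name] = group["awarded_amt"].sum()

# The same three steps in one expression
totals = df_toy.groupby("supplier_name")["awarded_amt"].sum()

print(manual)
print(totals)
```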

### What is the number of tenders awarded to each supplier?

**Pass the column to group by first**, then specify the column on which to perform the count.

In [27]:
number_by_supplier = df_cleaned.groupby("supplier_name")["tender_no"].count()
number_by_supplier

supplier_name
*SCAPE CO., LTD.              1
01 COMPUTER SYSTEM PTE LTD    9
1 BISHAN MEDICAL PTE. LTD.    2
1 K PRODUCTIONS PTE. LTD.     1
1 SUPPLIER PTE. LTD.          1
                             ..
memsstar Limited              1
metaphacts GmbH               1
muholis andriyatni            1
pilot44                       1
young kyoung ahn              1
Name: tender_no, Length: 6230, dtype: int64
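
One caveat worth checking: `count()` counts rows, not distinct tender numbers. If a supplier appears on more than one row for the same tender, `nunique()` gives the number of distinct tenders instead. The sketch below shows the difference on made-up data (`df_toy` is hypothetical), and sorts the result to surface the most frequent suppliers.

```python
import pandas as pd

# Hypothetical toy data: supplier A appears twice for tender T1
df_toy = pd.DataFrame({
    "supplier_name": ["A", "A", "A", "B"],
    "tender_no": ["T1", "T1", "T2", "T3"],
})

grouped = df_toy.groupby("supplier_name")["tender_no"]

# Number of rows (non-missing tender_no values) per supplier
rows_per_supplier = grouped.count()

# Number of distinct tender numbers per supplier
tenders_per_supplier = grouped.nunique()

# Largest counts first
ranked = rows_per_supplier.sort_values(ascending=False)

print(rows_per_supplier)
print(tenders_per_supplier)
print(ranked.index.tolist())
```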

### Which are the tenders awarded to "NEC ASIA PACIFIC PTE. LTD."?

In [96]:
number_by_nec = df_cleaned.groupby("supplier_name").get_group("NEC ASIA PACIFIC PTE. LTD.")
number_by_nec

Unnamed: 0,tender_no,tender_description,agency,award_date,tender_detail_status,supplier_name,awarded_amt
40,AGC000ETT21000003,"INVITATION TO TENDER FOR DESIGN, SUPPLY, DELIV...",Attorney-General's Chambers,2021-11-01,Awarded to Suppliers,NEC ASIA PACIFIC PTE. LTD.,$5020611.000
41,AGC000ETT21000004,"INVITATION TO TENDER FOR THE DEVELOPMENT, MAIN...",Attorney-General's Chambers,2021-11-28,Awarded to Suppliers,NEC ASIA PACIFIC PTE. LTD.,$2616961.000
172,BCA000ETT21000025,Invitation To Tender for the Provision of Upgr...,Building and Construction Authority,2022-01-24,Awarded to Suppliers,NEC ASIA PACIFIC PTE. LTD.,$1738000.000
855,CPF000ETT19300027,Implementation and Maintenance of User and Ent...,Central Provident Fund Board,2019-12-12,Awarded to Suppliers,NEC ASIA PACIFIC PTE. LTD.,$1654344.000
1040,CRA000ETT20300004,Invitation to Tender for the Provision of Agil...,Gambling Regulatory Authority of Singapore (GRA),2021-01-19,Awarded to Suppliers,NEC ASIA PACIFIC PTE. LTD.,$7500550.000
1763,EMA000ETT22000010,PROVISION OF AGILE APPLICATION DEVELOPMENT AND...,Energy Market Authority of Singapore,2022-11-10,Awarded to Suppliers,NEC ASIA PACIFIC PTE. LTD.,$11649765.010
2051,FINHQ0ETT19300003,"OUTCOME-BASED PROCUREMENT FOR THE SUPPLY, DELI...",Ministry of Finance-Ministry Headquarter,2019-11-18,Awarded to Suppliers,NEC ASIA PACIFIC PTE. LTD.,$5329739.950
3155,GVT000ETT19300031,FOR AGILE CO-DEVELOPMENT AND ICT PROFESSIONAL ...,Government Technology Agency (GovTech),2020-05-29,Awarded by Items,NEC ASIA PACIFIC PTE. LTD.,$1.000
3374,GVT000ETT22000004,PROVISION OF NETWORK EQUIPMENT AND CABLING INF...,Government Technology Agency (GovTech),2022-08-05,Awarded by Items,NEC ASIA PACIFIC PTE. LTD.,$1.000
3432,GVT000ETT22000034,"FOR THE SUPPLY, DELIVERY, INSTALLATION, WARRAN...",Government Technology Agency (GovTech),2023-10-25,Awarded by Items,NEC ASIA PACIFIC PTE. LTD.,$1.000


### What is the total amount awarded to "NEC ASIA PACIFIC PTE. LTD."?


In [99]:
df_cleaned.groupby("supplier_name").get_group("NEC ASIA PACIFIC PTE. LTD.").awarded_amt.sum()

102601792.52

## Practice: What is the total amount awarded to "NEC ASIA PACIFIC PTE. LTD." by "Singapore Food Agency"?

In [101]:
# Method 1: select the supplier's group, then filter it by its own agency column
nec = df_cleaned.groupby("supplier_name").get_group("NEC ASIA PACIFIC PTE. LTD.")
nec.loc[nec["agency"] == "Singapore Food Agency", "awarded_amt"].sum()

2854468.0

In [102]:
# Method 2 (More efficient and flexible)
df_cleaned.groupby(["supplier_name","agency"])["awarded_amt"].sum()

supplier_name               agency                                     
*SCAPE CO., LTD.            Ministry of Finance - Vital                       $6320.000
01 COMPUTER SYSTEM PTE LTD  Building and Construction Authority             $139631.400
                            Institute of Technical Education                   $755.000
                            Ministry of Education                            $16590.960
                            Ministry of Finance - Vital                   $10044019.000
                                                                               ...     
memsstar Limited            Agency for Science, Technology and Research     $896448.000
metaphacts GmbH             National Library Board                         $3532178.570
muholis andriyatni          Judiciary-State Courts                           $12000.000
pilot44                     Enterprise Singapore                            $333333.320
young kyoung ahn            Singapore Tourism Bo

In [103]:
df_cleaned.groupby(["supplier_name","agency"])["awarded_amt"].sum().loc["NEC ASIA PACIFIC PTE. LTD.", "Singapore Food Agency"]

2854468.0
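
Grouping by two columns produces a Series with a two-level index, which is why `.loc` above takes two labels. The sketch below, on made-up data (the supplier and agency names in `df_toy` are hypothetical), shows looking up one pair, selecting every agency for one supplier, and checking whether a pair exists before looking it up.

```python
import pandas as pd

# Hypothetical toy data; column names mirror the dataset
df_toy = pd.DataFrame({
    "supplier_name": ["NEC", "NEC", "NEC", "OTHER"],
    "agency": ["Agency X", "Agency X", "Agency Y", "Agency X"],
    "awarded_amt": [10.0, 5.0, 7.0, 100.0],
})

totals = df_toy.groupby(["supplier_name", "agency"])["awarded_amt"].sum()

# One (supplier, agency) pair: returns a single number
one_pair = totals.loc[("NEC", "Agency X")]

# Only the first level: returns a Series indexed by agency
per_agency = totals.loc["NEC"]

# A missing pair would raise a KeyError in .loc, so test membership first
has_pair = ("NEC", "Agency Z") in totals.index

print(one_pair)
print(per_agency)
print(has_pair)
```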

## Practice: Group by more subsets - Include the year as an additional subset

We can group by more subsets using ```groupby```. To achieve this, we can include the year of the award date as an additional grouping key.

In [107]:
df_cleaned.groupby(["supplier_name","agency", df_cleaned["award_date"].dt.year])["awarded_amt"].sum()

supplier_name               agency                                       award_date
*SCAPE CO., LTD.            Ministry of Finance - Vital                  2023            $6320.000
01 COMPUTER SYSTEM PTE LTD  Building and Construction Authority          2020          $139631.400
                            Institute of Technical Education             2022             $755.000
                            Ministry of Education                        2021           $16590.960
                            Ministry of Finance - Vital                  2020         $4593099.960
                                                                                          ...     
memsstar Limited            Agency for Science, Technology and Research  2022          $896448.000
metaphacts GmbH             National Library Board                       2021         $3532178.570
muholis andriyatni          Judiciary-State Courts                       2021           $12000.000
pilot44                  

In [115]:
df_cleaned.groupby(["supplier_name","agency", df_cleaned["award_date"].dt.year]).agg({'awarded_amt': 'sum', 
    'tender_no': 'count'})

Unnamed: 0_level_0,Unnamed: 1_level_0,Unnamed: 2_level_0,awarded_amt,tender_no
supplier_name,agency,award_date,Unnamed: 3_level_1,Unnamed: 4_level_1
"*SCAPE CO., LTD.",Ministry of Finance - Vital,2023,$6320.000,1
01 COMPUTER SYSTEM PTE LTD,Building and Construction Authority,2020,$139631.400,1
01 COMPUTER SYSTEM PTE LTD,Institute of Technical Education,2022,$755.000,1
01 COMPUTER SYSTEM PTE LTD,Ministry of Education,2021,$16590.960,1
01 COMPUTER SYSTEM PTE LTD,Ministry of Finance - Vital,2020,$4593099.960,1
...,...,...,...,...
memsstar Limited,"Agency for Science, Technology and Research",2022,$896448.000,1
metaphacts GmbH,National Library Board,2021,$3532178.570,1
muholis andriyatni,Judiciary-State Courts,2021,$12000.000,1
pilot44,Enterprise Singapore,2020,$333333.320,1
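
The aggregated columns above keep the original names `awarded_amt` and `tender_no`, which can be misleading once they hold sums and counts. A possible alternative is pandas named aggregation, sketched below on made-up data. The names `df_toy`, `award_year`, `total_awarded` and `n_tenders` are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical toy data; column names mirror the dataset
df_toy = pd.DataFrame({
    "supplier_name": ["A", "A", "B"],
    "agency": ["X", "X", "Y"],
    "award_date": pd.to_datetime(["2021-01-05", "2021-06-30", "2022-02-01"]),
    "tender_no": ["T1", "T2", "T3"],
    "awarded_amt": [100.0, 50.0, 10.0],
})

# Rename the year Series so the index level is called award_year
award_year = df_toy["award_date"].dt.year.rename("award_year")

summary = (
    df_toy.groupby(["supplier_name", "agency", award_year])
    # new_column_name=(source_column, aggregation)
    .agg(total_awarded=("awarded_amt", "sum"), n_tenders=("tender_no", "count"))
    # Turn the three index levels back into ordinary columns
    .reset_index()
)

print(summary)
```

Calling `reset_index()` is optional; it flattens the result into a regular table, which is often easier to filter or export.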


In [112]:
df_cleaned.groupby(["supplier_name","agency", df_cleaned["award_date"].dt.year]).agg({'awarded_amt': 'sum', 
    'tender_no': 'count'}).loc["NEC ASIA PACIFIC PTE. LTD."]

Unnamed: 0_level_0,Unnamed: 1_level_0,awarded_amt,tender_no
agency,award_date,Unnamed: 2_level_1,Unnamed: 3_level_1
"Agency for Science, Technology and Research",2023,$150000.000,1
Attorney-General's Chambers,2021,$7637572.000,2
Building and Construction Authority,2022,$1738000.000,1
Central Provident Fund Board,2019,$1654344.000,1
Energy Market Authority of Singapore,2022,$11649765.010,1
Gambling Regulatory Authority of Singapore (GRA),2021,$7500550.000,1
Government Technology Agency (GovTech),2020,$1.000,1
Government Technology Agency (GovTech),2022,$1.000,1
Government Technology Agency (GovTech),2023,$1.000,1
Health Promotion Board,2022,$1.000,1
