# Crawling and Analysing The Cambridge Computer Crime Database (CCCD)

In [7]:
cccd_link = "https://www.cl.cam.ac.uk/~ah793/cccd.html"
cccd_googlespreadsheet_link = "https://docs.google.com/spreadsheets/d/e/2PACX-1vRNQRD4_BMaRUoav-G54B6Nh1MAZlhvH3vqg5RnOdWRT_bG7WVIxH52KsP4kj71SOGwHRLyRQwoRpWi/pubhtml;headers=true"

In [58]:
import cfscrape
import pandas as pd

# None disables column-width truncation; the older -1 value is deprecated in recent pandas.
pd.set_option('display.max_colwidth', None)

In [11]:
scraper = cfscrape.create_scraper()
scraped_html = scraper.get(cccd_googlespreadsheet_link).content
tables = pd.read_html(scraped_html)  # Returns a list of all tables on the page

### Pre-processing the output of the crawler

In [40]:
df = tables[0]
df.columns = df.iloc[0]
df = df.drop([1], axis=1)
df = df.drop([0])
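
The header-promotion step can be tried on a small stand-in frame. This is a minimal sketch, not the notebook's actual data: the `raw` frame, its column names, and the assumption that the first column is a spreadsheet row-number gutter are all hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for tables[0]: row 0 carries the real column names
# and column 0 is assumed to be the spreadsheet's row-number gutter.
raw = pd.DataFrame([
    [None, "Date", "Sentence"],
    [1, "July 2017", "Fined"],
    [2, "2018", "28 month sentence"],
])

toy = raw.iloc[1:, 1:].copy()            # drop the header row and the gutter column
toy.columns = raw.iloc[0, 1:].tolist()   # promote row 0 to column names
toy = toy.reset_index(drop=True)         # renumber the remaining rows from 0

print(toy)
```

Slicing with `iloc` before assigning the column names avoids dropping columns by a label that depends on what the first row happens to contain.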

### Showing the first two rows

In [76]:
df.head(2)

Unnamed: 0,Date of alleged offence (or range),Age at time of alleged offence (estimated using other known data),Age at time of arrest or most recent court appearance,Date of arrest or most recent court appearance,Gender of alleged offender,Alleged co-offender(s),Overview of alleged offence type,Sentence,Alleged financial gain (confirmed - not estimated),Alleged other costs,Source/s
1,July 2017,58,60,27/02/2019 sentenced,Male,.,"A former senior local government officer employed at the Nuneaton and Bedworth District Council, he was prosecuted for passing the personal information of rival job applicants to his partner, who had applied for an administrative role. He accessed the authority’s recruitment system and emailed the personal information of the nine rival shortlisted candidates to both his own work email address and also his partner’s email account. The recruitment packs he shared included the name, address, telephone number and CV of each candidate, along with contact details for each of their two referees. He admitted a charge of unlawfully sharing data in breach of s55 of the Data Protection Act 1998.",Fined £660 and ordered to pay £713.75 costs and a victim surcharge of £66,.,.,https://perma.cc/CKH6-Y29C
2,.,.,39,25/02/2019 sentenced,Male,Group of 4 arrested; other alleged co-offenders based in India,"He was identified as being part of a scam that involved victims being contacted by someone claiming to be from Microsoft's IT department. Victims were told there was a problem with their computer and that they could resolve the issue for a fee. Instead of fixing the problem, malware was installed on the computer that would allow them to steal even more money. While the fraudsters were based in the Delhi region of India but they were asking victims to transfer money to a UK based bank account. He was identified as the owner of the account. He was transferring the cash back to Delhi, while taking a cut for himself.",28 month sentence,Believed to have retained tens of thousands of pounds for himself,"Helped steal £400,000 from victims (as a group)",https://perma.cc/DH6B-5TQM https://perma.cc/AWZ2-R7GD


### Investigating cases related to booters, DoS, or DDoS attacks


In [81]:
booter_strings = ["denial of service",
                  "distributed denial of service",
                  "ddos", 
                  "booter",
                  "stresser",
                  "as a service","as-a-service"
                 ]

In [82]:
for word in booter_strings:
    print(word, ':', len(df[df['Overview of alleged offence type'].str.contains(word, case=False)]))
#     display(df[df['Overview of alleged offence type'].str.contains(word, case=False)])

denial of service : 46
distributed denial of service : 7
ddos : 10
booter : 27
stresser : 23
as a service : 1
as-a-service : 0
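
The per-term counts overlap (any row containing "distributed denial of service" also contains "denial of service"), so they cannot simply be summed. The sketch below illustrates this on made-up overviews; the `toy` series and its contents are hypothetical, and `na=False` is added on the assumption that some overviews could be missing.

```python
import pandas as pd

terms = ["denial of service", "distributed denial of service", "booter", "stresser"]

# Hypothetical overviews, including one missing value.
toy = pd.Series([
    "He ran a booter (stresser) service.",
    "A distributed denial of service attack on a bank.",
    "Denial of service attack against a school.",
    "Unauthorised access to email accounts.",
    None,
])

# One boolean column per term. regex=False matches the term literally,
# and na=False counts a missing overview as "no match".
hits = pd.DataFrame(
    {t: toy.str.contains(t, case=False, regex=False, na=False) for t in terms}
)

per_term = hits.sum()                    # matches per term (rows may count twice)
union = int(hits.any(axis=1).sum())      # rows matching at least one term

print(per_term)
print("rows matching any term:", union)
```

In this toy data the per-term counts add up to 5 while only 3 distinct rows match, which is the same effect that separates the individual counts above from the size of the combined result.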


### Only Booter and Stresser

In [91]:
only_booter_n_stresser = df[df['Overview of alleged offence type'].str.contains('|'.join(["booter","stresser"]), case=False)]
print("Entries:", len(only_booter_n_stresser))
only_booter_n_stresser['Overview of alleged offence type']

Entries: 28


18     He was originally charged with nine offences under the Computer Misuse Act, two of blackmail, one of possession of criminal property, and one charge that he endangered human welfare with a cyber-attack against Lonestar MTN, Liberia’s biggest internet provider. The accusations that he attempted to blackmail Lloyds and Barclays banks after committing denial of service attacks were subsequently dropped by the prosecution. He pleaded guilty two offences against the Computer Misuse Act and one charge of possessing criminal property, which related to the denial of service attacks against Lonestar MTN. He had been working for a rival Liberian network provider when he conducted denial of service attacks against Lonestar MTN, first using rented botnets and booter services, and then his own Mirai botnet. Lonestar Cell MTN revealed that it was experiencing unprecedented and repeated denial of service attacks, which disabled internet access across Liberia. He had been arrested in the UK a

### Union of rows containing at least one of the terms in the list

In [92]:
len(df[df['Overview of alleged offence type'].str.contains('|'.join(booter_strings), case=False)])

51
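
Because `'|'.join(...)` builds a regular expression, a search term containing characters such as `(`, `)` or `.` would be interpreted as regex syntax rather than literal text. The terms in `booter_strings` contain none, but a defensive variant might escape each term first. This is a sketch on hypothetical terms and data, not a change to the result above.

```python
import re
import pandas as pd

# Hypothetical terms; "booter(s)" contains regex metacharacters.
terms = ["booter(s)", "as-a-service", "ddos"]

# Escape each term so it is matched literally, then join into one alternation.
pattern = "|".join(re.escape(t) for t in terms)

toy = pd.Series([
    "Sold booter(s) online",
    "Ran several booters",
    "DDoS attack",
    None,  # missing overview
])

# na=False treats the missing overview as a non-match.
mask = toy.str.contains(pattern, case=False, na=False)
print(mask.tolist())
```

With escaping, only the row containing the literal text "booter(s)" matches that term; the row with "booters" does not.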

In [93]:
df[df['Overview of alleged offence type'].str.contains('|'.join(booter_strings), case=False)]['Overview of alleged offence type']

18     He was originally charged with nine offences under the Computer Misuse Act, two of blackmail, one of possession of criminal property, and one charge that he endangered human welfare with a cyber-attack against Lonestar MTN, Liberia’s biggest internet provider. The accusations that he attempted to blackmail Lloyds and Barclays banks after committing denial of service attacks were subsequently dropped by the prosecution. He pleaded guilty two offences against the Computer Misuse Act and one charge of possessing criminal property, which related to the denial of service attacks against Lonestar MTN. He had been working for a rival Liberian network provider when he conducted denial of service attacks against Lonestar MTN, first using rented botnets and booter services, and then his own Mirai botnet. Lonestar Cell MTN revealed that it was experiencing unprecedented and repeated denial of service attacks, which disabled internet access across Liberia. He had been arrested in the UK a