## Importing Data from a SQLite Database
---

In [18]:
import sqlite3
import pandas as pd
from IPython.display import display, Markdown

# Connect to SQLite database
conn    = sqlite3.connect('adzuna_jobs.db')

# Read a table into a dataframe
jobs_df = pd.read_sql_query("SELECT * FROM jobs", conn)

# close the connection
conn.close()

# Shape and Preview
display(Markdown('### Jobs Dataframe'))
display(jobs_df.head())
display(Markdown('---'))
display(Markdown('Descriptive Statistics'))
display(jobs_df.describe())
display(Markdown('---'))
display(Markdown('Missing Values'))
display(jobs_df.isnull().sum())
display(Markdown('---'))
display(Markdown('Data Types'))
display(jobs_df.dtypes)
display(Markdown('---'))
display(Markdown('Shape'))
display(jobs_df.shape)


### Jobs Dataframe

| | id | title | company | area_list | location | category | salary_max | salary_min | salary_is_predicted | latitude | longitude | contract_type | contract_time | description | created | redirect_url |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 5310740788 | Data Analyst | Cedar Recruitment | ["UK", "London"] | London, UK | IT Jobs | 50000.0 | 50000.0 | 0 | | | permanent | | Data Analyst (Sports betting sector experience... | 2025-07-18T22:35:04Z | https://www.adzuna.co.uk/jobs/land/ad/53107407... |
| 1 | 5311386740 | Data Analyst | Halian Technology Limited | ["UK", "London", "East London", "Poplar"] | Poplar, East London | IT Jobs | 117000.0 | 104000.0 | 0 | 51.504431 | -0.014588 | contract | | Halian Technology are currently recruiting a D... | 2025-07-19T02:47:37Z | https://www.adzuna.co.uk/jobs/land/ad/53113867... |
| 2 | 5297811155 | Data Analyst | Bernecker's Nursery | ["UK", "London", "East London", "Canning Town"] | Canning Town, East London | Part time Jobs | 93600.0 | 93600.0 | 0 | 51.508202 | 0.035485 | permanent | part_time | Description We are seeking a skilled Data Anal... | 2025-07-11T16:04:33Z | https://www.adzuna.co.uk/jobs/details/52978111... |
| 3 | 5307578659 | Pension System Calculation and Data Analyst | Morson Talent | ["UK", "London", "Central London", "Blackfriars"] | Blackfriars, Central London | Accounting & Finance Jobs | 55000.0 | 55000.0 | 0 | 51.515704 | -0.104021 | permanent | | Pension System Calculation and Data Analyst Lo... | 2025-07-17T02:46:22Z | https://www.adzuna.co.uk/jobs/land/ad/53075786... |
| 4 | 5310220101 | Data Analyst | Cedar | ["UK", "London"] | London, UK | IT Jobs | 50000.0 | 50000.0 | 0 | | | permanent | full_time | Data Analyst (Sports betting sector experience... | 2025-07-18T20:58:40Z | https://www.adzuna.co.uk/jobs/land/ad/53102201... |


---

Descriptive Statistics

| | salary_max | salary_min | latitude | longitude |
|---|---|---|---|---|
| count | 1488.0 | 1489.0 | 894.0 | 894.0 |
| mean | 106764.647641 | 97518.290591 | 39.683289 | -90.319594 |
| std | 60015.286834 | 48475.753153 | 4.938213 | 39.128809 |
| min | 26.0 | 0.0 | 32.7211 | -122.749887 |
| 25% | 61366.7625 | 59797.56 | 37.39406 | -121.46649 |
| 50% | 94770.535 | 89684.62 | 37.939426 | -117.164709 |
| 75% | 137237.8575 | 127690.94 | 40.755381 | -73.978504 |
| max | 728000.0 | 312000.0 | 51.6521 | 0.0986 |


---

Missing Values

id                        0
title                     0
company                   9
area_list                 0
location                  0
category                  0
salary_max                1
salary_min                0
salary_is_predicted       0
latitude                595
longitude               595
contract_type          1272
contract_time           914
description               0
created                   0
redirect_url              0
dtype: int64

---

Data Types

id                      object
title                   object
company                 object
area_list               object
location                object
category                object
salary_max             float64
salary_min             float64
salary_is_predicted     object
latitude               float64
longitude              float64
contract_type           object
contract_time           object
description             object
created                 object
redirect_url            object
dtype: object

---

Shape

(1489, 16)
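
The data types above show `created`, `salary_is_predicted`, and `area_list` stored as `object`. The following is a minimal sketch of how they might be normalized before analysis. It assumes the formats seen in the preview hold for every row (ISO 8601 timestamps, `"0"`/`"1"` strings, JSON-encoded lists), which has not been verified here. The `sample` frame and the `normalize_types` helper are hypothetical stand-ins, not part of the original notebook.

```python
import json
import pandas as pd

# Hypothetical stand-in for jobs_df, using values copied from the preview above
sample = pd.DataFrame({
    "id": ["5310740788", "5311386740"],
    "area_list": ['["UK", "London"]', '["UK", "London", "East London", "Poplar"]'],
    "salary_is_predicted": ["0", "0"],  # assumed to be stored as text
    "created": ["2025-07-18T22:35:04Z", "2025-07-19T02:47:37Z"],
})

def normalize_types(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with timestamp, flag, and list columns converted."""
    out = df.copy()
    # Parse ISO 8601 strings into timezone-aware timestamps; bad values become NaT
    out["created"] = pd.to_datetime(out["created"], utc=True, errors="coerce")
    # Convert the flag to a nullable integer; bad values become <NA>
    out["salary_is_predicted"] = pd.to_numeric(
        out["salary_is_predicted"], errors="coerce"
    ).astype("Int64")
    # Decode the JSON-encoded area hierarchy into Python lists
    out["area_list"] = out["area_list"].apply(json.loads)
    return out

clean = normalize_types(sample)
print(clean.dtypes)
```

On the real data, the same function would be called as `normalize_types(jobs_df)`, provided the assumptions above hold.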

## Exploratory Data Analysis
1. Shape and preview.
2. Basic profiling.
3. Job posting trend over time.
4. Simplify the area information into a general area for a high-level view.
5. Inspect the descriptions to extract requirements and responsibilities.
---
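
Steps 3 and 4 of the plan above could be sketched as follows. This is an illustration under stated assumptions, not the notebook's actual implementation: `sample` is a hypothetical stand-in for `jobs_df`, `general_area` is a hypothetical helper, and keeping the first two levels of `area_list` as the "general area" is an assumed interpretation of step 4.

```python
import json
import pandas as pd

# Hypothetical stand-in for jobs_df, using values copied from the preview
sample = pd.DataFrame({
    "area_list": [
        '["UK", "London"]',
        '["UK", "London", "East London", "Poplar"]',
        '["UK", "London", "East London", "Canning Town"]',
    ],
    "created": [
        "2025-07-18T22:35:04Z",
        "2025-07-19T02:47:37Z",
        "2025-07-11T16:04:33Z",
    ],
})

# Step 3: count postings per calendar day (UTC), ordered by date
created = pd.to_datetime(sample["created"], utc=True)
daily_counts = created.dt.date.value_counts().sort_index()

# Step 4: keep only the first `depth` levels of the area hierarchy
def general_area(raw: str, depth: int = 2) -> str:
    parts = json.loads(raw)  # area_list is assumed to be a JSON-encoded list
    return ", ".join(parts[:depth])

sample["general_area"] = sample["area_list"].apply(general_area)

print(daily_counts)
print(sample["general_area"].value_counts())
```

The `depth` parameter controls how coarse the grouping is; a value of 1 would group by country only.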

In [19]:
# Unique values
print(f"Number of companies: {jobs_df['company'].nunique()}")
print(f"\nNumber of categories: {jobs_df['category'].nunique()}, \n unique categories are: \n{jobs_df['category'].unique()}")
print(f"\nNumber of titles: {jobs_df['title'].nunique()}")
print(f"\nNumber of contract types: {jobs_df['contract_type'].nunique()}, \n unique contract types are: \n{jobs_df['contract_type'].unique()}")
print(f"\nNumber of contract times: {jobs_df['contract_time'].nunique()}, \n unique contract times are: \n{jobs_df['contract_time'].unique()}")

Number of companies: 791

Number of categories: 21, 
 unique categories are: 
['IT Jobs' 'Part time Jobs' 'Accounting & Finance Jobs'
 'PR, Advertising & Marketing Jobs' 'Graduate Jobs' 'Scientific & QA Jobs'
 'Admin Jobs' 'Trade & Construction Jobs' 'Consultancy Jobs' 'Retail Jobs'
 'HR & Recruitment Jobs' 'Legal Jobs' 'Energy, Oil & Gas Jobs'
 'Healthcare & Nursing Jobs' 'Maintenance Jobs' 'Manufacturing Jobs'
 'Teaching Jobs' 'Engineering Jobs' 'Customer Services Jobs'
 'Other/General Jobs' 'Sales Jobs']

Number of titles: 750

Number of contract types: 2, 
 unique contract types are: 
['permanent' 'contract' None]

Number of contract times: 2, 
 unique contract times are: 
[None 'part_time' 'full_time']
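
In the output above, the contract columns report a count of 2 while the printed arrays contain three entries. This is because `Series.nunique()` ignores missing values by default, whereas `Series.unique()` includes `None`. The short sketch below shows the difference on a small hypothetical series (the values are illustrative, not taken from the full dataset) and uses `value_counts(dropna=False)` to show how often each value, including the missing one, occurs.

```python
import pandas as pd

# Hypothetical miniature of the contract_type column
contract_type = pd.Series(
    ["permanent", "contract", None, "permanent"], name="contract_type"
)

# Default behaviour: missing values are not counted as a distinct value
n_without_missing = contract_type.nunique()

# Count the missing value as its own category
n_with_missing = contract_type.nunique(dropna=False)

# Frequency of every value, with missing entries kept in the tally
counts = contract_type.value_counts(dropna=False)

print(n_without_missing, n_with_missing)
print(counts)
```

Applied to `jobs_df['contract_type']`, the same `value_counts(dropna=False)` call would show how the 1272 missing entries compare with the labelled ones.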
