# Data Collection – Apartment Prices in Kraków

## Goal
This notebook collects raw apartment listing data for Kraków
for use in later stages of the data science project.

At this stage:
- no data cleaning is performed
- no analysis is done
- data is saved in raw form


## Data source
Data source:
- Otodom.pl (public real estate listings)

Important notes:
- Data collection is for educational purposes only
- No personal data will be collected
- Scraping logic lives in a separate module


## Data collection logic

Scraping logic is implemented in:
- src/scraping.py

This notebook only calls that logic and saves the results to:
- data/raw/


In [None]:
import pandas as pd

In [6]:
# Clone the GitHub repository into Colab (skipped if it is already cloned)

from pathlib import Path

%cd /content

REPO_URL = "https://github.com/piotrevolta/krakow-apartment-price-prediction.git"
REPO_DIR = Path("/content/krakow-apartment-price-prediction")

if not REPO_DIR.exists():
    !git clone {REPO_URL}

%cd /content/krakow-apartment-price-prediction
!pwd
!ls


/content
/content/krakow-apartment-price-prediction
/content/krakow-apartment-price-prediction
 data			   notebooks   reports		  src
'Fields description.txt'   README.md   requirements.txt
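
The `%cd` and `!git` lines above are IPython magics, so the cell only runs inside a notebook. The same setup can be sketched in plain Python, which is handy for running the collection step from a script. `ensure_repo` is a hypothetical helper, not part of the repository; it assumes `git` is available on the PATH and adds the repository root to `sys.path` so that `import src.scraping` resolves regardless of the working directory.

```python
import subprocess
import sys
from pathlib import Path


def ensure_repo(repo_url: str, repo_dir: Path) -> Path:
    """Hypothetical helper: clone repo_url into repo_dir unless it already exists."""
    if not repo_dir.exists():
        # check=True raises CalledProcessError if the clone fails
        subprocess.run(["git", "clone", repo_url, str(repo_dir)], check=True)

    # Put the repository root on sys.path so `import src.scraping` works
    # without changing the current working directory.
    root = str(repo_dir.resolve())
    if root not in sys.path:
        sys.path.insert(0, root)
    return repo_dir
```

In this notebook it would be called as `ensure_repo(REPO_URL, REPO_DIR)`.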


In [22]:
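# Sync the Colab checkout with GitHub (uncomment to pull the latest changes)
# !git pull --rebase

import importlib
import src.scraping as sc

# Reload the module so edits to src/scraping.py take effect without restarting the runtime
importlib.reload(sc)
from src.scraping import collect_raw_listings, enrich_with_details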

print("Imported from:", sc.__file__)


Imported from: /content/krakow-apartment-price-prediction/src/scraping.py
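
The order in the cell above matters: names bound with `from module import name` before a reload keep pointing at the old objects, so the `from ... import` has to run again after `importlib.reload`. A minimal, self-contained sketch of that behaviour, using a throwaway module named `demo_scraping` (a hypothetical name, created in a temporary directory purely for illustration):

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    module_path = Path(tmp) / "demo_scraping.py"
    module_path.write_text("MAX_PAGES = 1\n")
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()  # make the import system notice the new file
        import demo_scraping
        from demo_scraping import MAX_PAGES as before

        # Simulate editing the module on disk, as after a `git pull`
        module_path.write_text("MAX_PAGES = 25\n")
        importlib.reload(demo_scraping)

        # `before` still holds the old value; re-importing picks up the new one
        from demo_scraping import MAX_PAGES as after
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("demo_scraping", None)

print(before, after)
```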


In [27]:
# Collect listings (limited to one results page) and enrich them with details
df_raw = collect_raw_listings(max_pages=1)
df_enriched = enrich_with_details(df_raw, max_details=30, sleep_s=1.0)

print("df_raw shape:", df_raw.shape)
print("df_enriched shape:", df_enriched.shape)

# Save the enriched listings to data/raw/ as-is, without any cleaning
out_dir = Path("data/raw")
out_dir.mkdir(parents=True, exist_ok=True)

out_path = out_dir / "apartments_krakow_enriched.csv"
df_enriched.to_csv(out_path, index=False)

# Confirm where the file was written and what the directory now contains
print("Saved:", out_path.resolve())
print("Files in data/raw:", [p.name for p in out_dir.glob("*")])



df_raw shape: (21, 9)
df_enriched shape: (21, 10)
Saved: /content/krakow-apartment-price-prediction/data/raw/apartments_krakow_enriched.csv
Files in data/raw: ['apartments_krakow_raw.csv', '.gitkeep', 'apartments_krakow_enriched.csv']
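
The `max_details` and `sleep_s` arguments are not documented in this notebook. Their names suggest a cap on the number of detail pages fetched and a pause between requests. A minimal sketch of that pattern follows, assuming this interpretation is right; it is not the code in `src/scraping.py`. `enrich_rows`, `fetch_detail`, and `detail_field` are hypothetical names, and the fetcher is passed in as a function so the sketch runs without network access.

```python
import time
from typing import Callable, Dict, List


def enrich_rows(
    rows: List[Dict[str, str]],
    fetch_detail: Callable[[str], Dict[str, str]],
    max_details: int = 30,
    sleep_s: float = 1.0,
) -> List[Dict[str, str]]:
    """Hypothetical sketch: add detail fields to at most max_details rows."""
    enriched = []
    for i, row in enumerate(rows):
        merged = dict(row)  # copy so the input rows are left untouched
        if i < max_details:
            if i > 0:
                time.sleep(sleep_s)  # pause between consecutive requests
            merged.update(fetch_detail(row["listing_url"]))
        # Rows beyond the cap are kept, just without the extra fields
        enriched.append(merged)
    return enriched


def fake_fetch(url: str) -> Dict[str, str]:
    """Stand-in for a real detail-page request (assumption for the demo)."""
    return {"detail_field": "value"}


sample = [
    {"listing_url": "https://example.com/a"},
    {"listing_url": "https://example.com/b"},
    {"listing_url": "https://example.com/c"},
]
result = enrich_rows(sample, fake_fetch, max_details=2, sleep_s=0.0)
```

Keeping every input row, enriched or not, matches the shapes printed above, where `df_enriched` has the same number of rows as `df_raw` and one extra column.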


In [28]:

df_enriched.head(50)

Unnamed: 0,listing_url,address_text,address_street,address_subdistrict,address_district,address_city,address_voivodeship,price_text,price_per_m2_text,rooms_count_text
0,https://www.otodom.pl/pl/oferta/bezposrednio-3...,"os. Na Lotnisku, Nowe Bieńczyce, Bieńczyce, Kr...",os. Na Lotnisku,Nowe Bieńczyce,Bieńczyce,Kraków,małopolskie,599 000 zł,11 410 zł/m²,3
1,https://www.otodom.pl/pl/oferta/3-pokojowe-mie...,"ul. Josepha Conrada, Azory Zachód, Prądnik Bia...",ul. Josepha Conrada,Azory Zachód,Prądnik Biały,Kraków,małopolskie,901 511 zł,13 240 zł/m²,3
2,https://www.otodom.pl/pl/oferta/jasne-3niezale...,"ul. Kobierzyńska, Ruczaj, Dębniki, Kraków, mał...",ul. Kobierzyńska,Ruczaj,Dębniki,Kraków,małopolskie,795 000 zł,12 422 zł/m²,3
3,https://www.otodom.pl/pl/oferta/3-pokojowe-mie...,"ul. Dajwór, Kazimierz, Stare Miasto, Kraków, m...",ul. Dajwór,Kazimierz,Stare Miasto,Kraków,małopolskie,2 875 537 zł,41 250 zł/m²,3
4,https://www.otodom.pl/pl/oferta/2-pokoje-remon...,"Stare Bieńczyce, Bieńczyce, Kraków, małopolskie",,Stare Bieńczyce,Bieńczyce,Kraków,małopolskie,399 000 zł,10 711 zł/m²,2
5,https://www.otodom.pl/pl/oferta/ekskluzywna-ni...,"ul. Piaskowa, Bronowice Wielkie, Prądnik Biały...",ul. Piaskowa,Bronowice Wielkie,Prądnik Biały,Kraków,małopolskie,1 349 000 zł,20 754 zł/m²,3
6,https://www.otodom.pl/pl/oferta/wola-justowska...,"ul. Królowej Jadwigi, Wola Justowska, Zwierzyn...",ul. Królowej Jadwigi,Wola Justowska,Zwierzyniec,Kraków,małopolskie,725 000 zł,13 551 zł/m²,2
7,https://www.otodom.pl/pl/oferta/dwupoziomowy-a...,"ul. Władysława Łokietka, Tonie, Prądnik Biały,...",ul. Władysława Łokietka,Tonie,Prądnik Biały,Kraków,małopolskie,1 600 000 zł,10 159 zł/m²,6
8,https://www.otodom.pl/pl/oferta/unikatowy-tara...,"ul. Górka Narodowa, Górka Narodowa, Prądnik Bi...",ul. Górka Narodowa,Górka Narodowa,Prądnik Biały,Kraków,małopolskie,765 000 zł,14 088 zł/m²,3
9,https://www.otodom.pl/pl/oferta/mieszkanie-2-p...,"os. Dywizjonu 303 303, Czyżyny Lotnisko, Czyży...",os. Dywizjonu 303 303,Czyżyny Lotnisko,Czyżyny,Kraków,małopolskie,739 000 zł,14 519 zł/m²,2
