# Preliminary heat prices 2020


The Energy Regulatory Office (Energetický regulační úřad) regularly publishes the [Přehled cen tepelné energie v členění podle cenových lokalit](https://www.eru.cz/teplo/statistika/prehled-cen-tepelne-energie-v-cleneni-podle-cenovych-lokalit) ("Overview of heat energy prices by price locality") in PDF format.

For 2020, preliminary data on prices and supplied volumes are available for the individual localities.

In [1]:
import camelot
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Get pdf from the web and use Camelot to retrieve tables
pdf = 'http://www.eru.cz/documents/10540/462926/Predbezne_ceny_tepla_2020+%E2%80%93%20web.pdf'
tables = camelot.read_pdf(pdf, pages='1-end')
tables

<TableList n=50>
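
The next cell merges the extracted tables by taking each table's `.data` (a list of rows) and skipping its first two rows. A minimal sketch of that pattern on made-up rows, assuming every table starts with exactly two header rows; the `pages` list and the `HEADER_ROWS` name are illustrative and not part of the notebook:

```python
from itertools import chain

# Toy stand-ins for camelot's `table.data` (made-up rows, not ERÚ data):
# each table repeats two header rows before the actual records.
pages = [
    [["Lokalita", "Kraj"], ["", ""], ["Adamov", "B"], ["Žihle", "P"]],
    [["Lokalita", "Kraj"], ["", ""], ["Žernůvka", "B"]],
]

HEADER_ROWS = 2  # assumption: every table starts with exactly two header rows

# Drop the header rows of each table, then concatenate what is left.
records = list(chain.from_iterable(page[HEADER_ROWS:] for page in pages))
print(records)
```

If a table in the real PDF had a different number of header rows, stray header text would end up among the records, so it is worth spot-checking a few tables before merging.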

In [2]:
# Prepare header and labels
header = ['lokalita', 'kraj', 'pod_uhli', 'pod_plyn', 'pod_bio', 'pod_olej',
          'pod_ost', 'vykon', 'nad_10_czk', 'nad_10_gj',
          'pod_10_czk', 'pod_10_gj', 'cptv_czk', 'cptv_gj',
          'prim_rozv_czk', 'prim_rozv_gj', 'cvs_czk', 'cvs_gj',
          'cvs_voda_czk', 'cvs_voda_gj', 'blok_kot_czk', 'blok_kot_gj',
          'sek_rozv_czk', 'sek_rozv_gj',
          'dps_czk', 'dps_gj', 'dom_kot_czk', 'dom_kot_gj']
podily = ['pod_uhli', 'pod_plyn', 'pod_bio', 'pod_olej', 'pod_ost']
labels = ['uhlí', 'zemní plyn', 'biomasa', 'topný olej', 'jiné']

# Merge all tables into one DataFrame
ceny = []
for i in range(tables.n):
    ceny.extend(tables[i].data[2:])
ceny = pd.DataFrame(ceny, columns=header)

# Clean strings (decimal comma, thousands separator), repair null and convert to numeric types;
# empty cells become NaN and are then replaced with 0
num_cols = ceny.columns[2:]  # every column from 'pod_uhli' onwards
ceny[num_cols] = (ceny[num_cols]
                  .apply(lambda x: x.str.replace(',', '.', regex=False)
                                    .str.replace(' ', '', regex=False))
                  .apply(pd.to_numeric))
ceny = ceny.fillna(0)
# ceny['lokalita'] = ceny['lokalita'].apply(lambda x: x.replace('\0', 'ti'))  # There is a null value
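
The cleaning step above can be illustrated on single values in plain Python. This is a sketch only: `parse_czech_number` is a hypothetical helper, and the handling of non-breaking spaces is an assumption about what the PDF might contain rather than something the notebook confirms.

```python
def parse_czech_number(text: str) -> float:
    """Convert a Czech-formatted number such as '27 000,5' to a float.

    Hypothetical helper, not part of the notebook. Empty cells become 0.0,
    mirroring the fillna(0) step above.
    """
    # Remove ordinary spaces and (assumed) non-breaking spaces used as
    # thousands separators.
    cleaned = text.replace(" ", "").replace("\u00a0", "")
    # Swap the decimal comma for a decimal point.
    cleaned = cleaned.replace(",", ".")
    return float(cleaned) if cleaned else 0.0


print(parse_czech_number("27 000,5"))
print(parse_czech_number(""))
```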

In [3]:
# Aggregate energy and price across the various supply options
gj = ceny.filter(regex='_gj$', axis=1).columns
czk = ceny.filter(regex='_czk$', axis=1).columns
# Volume-weighted average price and total supplied volume per locality
ceny['dod_cena'] = np.average(ceny[czk], weights=ceny[gj], axis=1)
ceny['dod_mnozstvi'] = np.sum(ceny[gj], axis=1)

# Sanity check: the per-locality totals add up to the sum over all *_gj columns
ceny.dod_mnozstvi.sum() == ceny.filter(regex='_gj$', axis=1).sum().sum()

True
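
One caveat with the per-locality weighted average: `np.average` raises a `ZeroDivisionError` when the weights of a row sum to zero. The cell above ran without error, so presumably no locality in this dataset has zero total supply, but a guarded version may be useful for other years. A plain-Python sketch of the row-level calculation; `weighted_price` is a hypothetical helper, and returning 0.0 for rows without supply is an assumption (NaN might be the better choice):

```python
def weighted_price(prices, volumes):
    """Volume-weighted average price for one locality (hypothetical helper)."""
    total = sum(volumes)
    if total == 0:
        # Assumption: report 0.0 when nothing was supplied; NaN may be
        # more appropriate depending on the later analysis.
        return 0.0
    return sum(p * v for p, v in zip(prices, volumes)) / total


# Made-up example: 1000 GJ at 500 and 3000 GJ at 600.
print(weighted_price([500.0, 600.0], [1000.0, 3000.0]))
# A row with no supply at all does not raise.
print(weighted_price([0.0, 0.0], [0.0, 0.0]))
```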

In [4]:
# Overall average price, weighted by supplied volume across all localities and supply options
np.average(ceny.filter(regex='_czk+', axis=1), weights=ceny.filter(regex='_gj+', axis=1))

438.4192261912718

In [5]:
# The same average computed from the per-locality aggregates
np.average(ceny['dod_cena'], weights=ceny['dod_mnozstvi'])

438.4192261912718
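
The two results agree because a locality's weighted price multiplied by its total volume equals the sum of price times volume over its supply options, so both calculations divide the same numerator by the same total volume. A small sketch with made-up numbers (two localities, two supply options each), independent of the notebook's data:

```python
import math

# Made-up prices and volumes: rows are localities, columns are supply options.
prices = [[500.0, 600.0], [400.0, 0.0]]
volumes = [[1000.0, 3000.0], [2000.0, 0.0]]

# One step: weight every individual price by its own volume.
numerator = sum(p * v for pr, vr in zip(prices, volumes) for p, v in zip(pr, vr))
total_volume = sum(v for vr in volumes for v in vr)
one_step = numerator / total_volume

# Two steps: per-locality weighted price first, then weight by locality volume.
row_volume = [sum(vr) for vr in volumes]
row_price = [
    sum(p * v for p, v in zip(pr, vr)) / rv
    for pr, vr, rv in zip(prices, volumes, row_volume)
]
two_step = sum(p * v for p, v in zip(row_price, row_volume)) / sum(row_volume)

print(one_step, two_step)
```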

In [6]:
ceny.to_csv('eru_ceny_tepla_2020.csv', index=False)
ceny

| | lokalita | kraj | pod_uhli | pod_plyn | pod_bio | pod_olej | pod_ost | vykon | nad_10_czk | nad_10_gj | ... | blok_kot_czk | blok_kot_gj | sek_rozv_czk | sek_rozv_gj | dps_czk | dps_gj | dom_kot_czk | dom_kot_gj | dod_cena | dod_mnozstvi |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Abertamy - Hornická 468 | K | 0.0 | 100.0 | 0.0 | 0.0 | 0.0 | 0.132 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.00 | 0.0 | 600.38 | 736.0 | 600.38 | 736.0 |
| 1 | Adamov | B | 0.0 | 100.0 | 0.0 | 0.0 | 0.0 | 9.000 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 547.25 | 27000.0 | 0.00 | 0.0 | 547.25 | 27000.0 |
| 2 | Adamov | B | 0.0 | 100.0 | 0.0 | 0.0 | 0.0 | 2.203 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.00 | 0.0 | 0.00 | 0.0 | 255.31 | 22900.0 |
| 3 | Adamov - Opletalova 38 a 22 | B | 0.0 | 100.0 | 0.0 | 0.0 | 0.0 | 0.460 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.00 | 0.0 | 641.95 | 1400.0 | 641.95 | 1400.0 |
| 4 | Adamov - Petra Jilemnického 18 (K 72) | B | 0.0 | 100.0 | 0.0 | 0.0 | 0.0 | 0.090 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.00 | 0.0 | 333.99 | 549.0 | 333.99 | 549.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 3313 | Žernůvka | B | 0.0 | 100.0 | 0.0 | 0.0 | 0.0 | 0.300 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.00 | 0.0 | 530.89 | 1200.0 | 530.89 | 1200.0 |
| 3314 | Židlochovice | B | 0.0 | 100.0 | 0.0 | 0.0 | 0.0 | 0.863 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.00 | 0.0 | 468.60 | 3420.0 | 468.60 | 3420.0 |
| 3315 | Žihle | P | 0.0 | 0.0 | 100.0 | 0.0 | 0.0 | 0.814 | 0.0 | 0.0 | ... | 94.0 | 1000.0 | 0.0 | 0.0 | 0.00 | 0.0 | 0.00 | 0.0 | 94.00 | 1000.0 |
| 3316 | Žinkovy - Domov klidného stáří | P | 0.0 | 100.0 | 0.0 | 0.0 | 0.0 | 0.490 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.00 | 0.0 | 449.90 | 3000.0 | 449.90 | 3000.0 |


In [7]:
ceny.dod_cena.describe()

count    3318.000000
mean      537.033790
std       194.777113
min         0.000000
25%       426.082500
50%       526.350000
75%       633.690000
max      4641.870000
Name: dod_cena, dtype: float64
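
Note that `describe()` counts every locality once regardless of how much heat it supplies, which is why the mean here (537.03) is higher than the volume-weighted average computed earlier (438.42). A volume-weighted median can be sketched in the same spirit. The `weighted_median` helper below is hypothetical and uses the simple "first value whose cumulative weight reaches half the total" definition; the numbers are made up, not taken from the dataset.

```python
from statistics import median


def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half the total weight.

    Hypothetical helper, not part of the notebook.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    cumulative = 0.0
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight
        if cumulative >= total / 2:
            return value


# Made-up prices and supplied volumes for three localities.
toy_prices = [400.0, 500.0, 900.0]
toy_volumes = [10000.0, 200.0, 100.0]

print(median(toy_prices))                        # each locality counts once
print(weighted_median(toy_prices, toy_volumes))  # dominated by the large supplier
```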