# SQL for accessing spatial data on PostgreSQL

Database Systems lecture material  
version 0.0.1   
authors: H. Chenan & N. Tsutsumida  

Copyright (c) 2023 Narumasa Tsutsumida  
Released under the MIT license  
https://opensource.org/licenses/mit-license.php  

## Task

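F6. For every railway station in Saitama Prefecture, compute the population change rate between April 2019 (holidays, daytime) and April 2020 (holidays, daytime), defined as ((pop_202004 - pop_201904)/pop_201904). Sort the rates in ascending order and show the first 10 rows. (The output should consist of the prefecture name, the station name, and the population change rate.)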
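
As a quick illustration of the formula, here is a minimal sketch using made-up numbers (not values from the dataset). The helper name `change_rate` is hypothetical, and returning `None` for a zero baseline is an assumption made for this example, not something the task specifies.

```python
def change_rate(pop_before, pop_after):
    """Relative change from pop_before to pop_after; None if the baseline is zero."""
    if pop_before == 0:
        # Assumption for this sketch: an undefined rate is reported as None
        return None
    return (pop_after - pop_before) / pop_before


# Hypothetical values for illustration only
rate = change_rate(2000, 500)
print(rate)  # -0.75
```

A negative rate means the population fell between the two periods, so sorting in ascending order lists the largest decreases first.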


## Prerequisites

In [1]:
import os
from sqlalchemy import create_engine
import pandas as pd
import geopandas as gpd
import numpy as np
import folium
pd.set_option('display.max_columns', 100)




In [2]:
def query_pandas(sql, db):
    """
    Executes a SQL query on a PostgreSQL database and returns the result as a Pandas DataFrame.

    Args:
        sql (str): The SQL query to execute.
        db (str): The name of the PostgreSQL database to connect to.

    Returns:
        pandas.DataFrame: The result of the SQL query as a Pandas DataFrame.
    """

    DATABASE_URL='postgresql://postgres:postgres@postgis_container:5432/{}'.format(db)
    conn = create_engine(DATABASE_URL)

    df = pd.read_sql(sql=sql, con=conn)

    return df
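
The pattern `query_pandas` relies on, passing a SQL string and a connection to `pd.read_sql`, can be tried without the PostgreSQL container. The sketch below substitutes an in-memory SQLite database as a stand-in; the table `demo_pop` and its rows are hypothetical and exist only for this illustration.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database standing in for the PostgreSQL server (assumption for illustration)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_pop (station TEXT, population INTEGER)")
conn.executemany("INSERT INTO demo_pop VALUES (?, ?)", [("A", 120), ("B", 80)])
conn.commit()

# Same call shape as in query_pandas: a SQL string plus a connection
df = pd.read_sql(sql="SELECT station, population FROM demo_pop ORDER BY station", con=conn)
print(df)

conn.close()
```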

## Define a SQL command

In [3]:
sql = """
    WITH
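        -- 1 km mesh population for April 2020 in Saitama (prefcode 11), filtered to the task's holiday/daytime condition
        pop2020 AS (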
            SELECT p.name, d.year, d.month, d.prefcode, d.population, p.geom
            FROM pop AS d
            INNER JOIN pop_mesh AS p
                ON p.name = d.mesh1kmid
            WHERE d.dayflag='0' AND
                d.timezone='0' AND
                d.year='2020' AND
                d.month='04' AND
                d.prefcode='11'
            ),
        -- Baseline: 1 km mesh population for April 2019 with the same filters
        pop2019 AS (
            SELECT p.name, d.year, d.month, d.prefcode, d.population, p.geom
            FROM pop AS d
            INNER JOIN pop_mesh AS p
                ON p.name = d.mesh1kmid
            WHERE d.dayflag='0' AND
                d.timezone='0' AND
                d.year='2019' AND
                d.month='04' AND
                d.prefcode='11'
            ),
        -- Named railway stations from the OpenStreetMap point table
        station AS (
            SELECT DISTINCT pt.name, pt.way
            FROM planet_osm_point pt
            WHERE pt.railway='station' AND
                pt.name IS NOT NULL
            ),
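        -- 2020 population of the mesh cells containing each station (mesh geometry reprojected to EPSG:3857)
        s20 AS (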
            SELECT pop2020.prefcode, station.name, SUM(pop2020.population) AS population
            FROM station
            INNER JOIN pop2020
                ON st_within(station.way,st_transform(pop2020.geom, 3857))
            GROUP BY station.name, pop2020.prefcode
            ),
        -- 2019 population of the mesh cells containing each station
        s19 AS (
            SELECT pop2019.prefcode, station.name, SUM(pop2019.population) AS population
            FROM station
            INNER JOIN pop2019
                ON st_within(station.way,st_transform(pop2019.geom, 3857))
            GROUP BY station.name, pop2019.prefcode
            )
    SELECT
        -- Map the prefecture code to a name: '11' is Saitama (埼玉); 'その他' means "other"
        CASE
            WHEN s20.prefcode = '11' THEN '埼玉'
            ELSE 'その他'
        END AS pref_name,
        s20.name AS station_name,
        ((s20.population - s19.population) / s19.population) AS change_rate
    FROM s20
    INNER JOIN s19
        ON s20.name=s19.name
    ORDER BY change_rate
    LIMIT 10
    ;
    
"""
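
The final `SELECT` joins the two per-station totals on the station name, computes the change rate, and keeps the ten smallest values. A rough pandas equivalent of that last step is sketched below on hypothetical data; the frames `s19` and `s20` only mimic the shape of the CTEs and their values are invented. Treating a zero 2019 total as missing is an assumption of this sketch. In the SQL above, a zero `s19.population` would instead raise a division-by-zero error unless guarded, for example with `NULLIF`.

```python
import pandas as pd

# Hypothetical per-station totals standing in for the s19 and s20 CTEs
s19 = pd.DataFrame({"name": ["A", "B", "C"], "population": [1000.0, 400.0, 0.0]})
s20 = pd.DataFrame({"name": ["A", "B", "C"], "population": [250.0, 300.0, 50.0]})

# Inner join on the station name, as in the final SELECT
merged = s20.merge(s19, on="name", suffixes=("_20", "_19"))

# Assumption for this sketch: a zero baseline becomes NaN instead of an error
baseline = merged["population_19"].where(merged["population_19"] != 0)
merged["change_rate"] = (merged["population_20"] - baseline) / baseline

# Ascending sort and first 10 rows (NaN rates are placed last by default)
result = merged.sort_values("change_rate").head(10)[["name", "change_rate"]]
print(result)
```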


## Outputs

In [4]:
out = query_pandas(sql,'gisdb')
print(out)

  pref_name station_name  change_rate
0        埼玉     ハートフルランド    -0.945013
1        埼玉          三峰口    -0.908116
2        埼玉        西武球場前    -0.872104
3        埼玉           白久    -0.823887
4        埼玉          西吾野    -0.750000
5        埼玉           用土    -0.736264
6        埼玉           竹沢    -0.722488
7        埼玉          新三郷    -0.704125
8        埼玉          大麻生    -0.692568
9        埼玉      さいたま新都心    -0.619451
