Commit

Merge 19d2c0b into d3aae89
mcarans committed Oct 11, 2023
2 parents d3aae89 + 19d2c0b commit c86661b
Showing 9 changed files with 174 additions and 58 deletions.
2 changes: 1 addition & 1 deletion .config/coveragerc
@@ -14,4 +14,4 @@ exclude_also =
if 0:
if __name__ == .__main__.:
if TYPE_CHECKING:
@(abc\.)?abstractmethod
8 changes: 4 additions & 4 deletions CONTRIBUTING.md
@@ -20,7 +20,7 @@ you make a git commit:

pre-commit install

The configuration file for this project is in a
non-standard location. Thus, you will need to edit your
`.git/hooks/pre-commit` file to reflect this. Change
the line that begins with `ARGS` to:
@@ -29,7 +29,7 @@ the line that begins with `ARGS` to:

With pre-commit, all code is formatted according to
[black](https://github.com/psf/black) and
[ruff](https://github.com/charliermarsh/ruff) guidelines.

To check if your changes pass pre-commit without committing, run:

@@ -46,8 +46,8 @@ Follow the example set out already in ``api.rst`` as you write the documentation
## Packages

[pip-tools](https://github.com/jazzband/pip-tools) is used for
package management. If you’ve introduced a new package to the
source code (i.e. anywhere in `src/`), please add it to the
`project.dependencies` section of
`pyproject.toml` with any known version constraints.

6 changes: 3 additions & 3 deletions README.md
@@ -5,9 +5,9 @@
[![Downloads](https://img.shields.io/pypi/dm/hdx-python-database.svg)](https://pypistats.org/packages/hdx-python-database)

The HDX Python Database Library provides utilities for connecting to databases in a standard way including
through an ssh tunnel if needed. It is built on top of SQLAlchemy and simplifies its setup.

For more information, please read the [documentation](https://hdx-python-database.readthedocs.io/en/latest/).

This library is part of the [Humanitarian Data Exchange](https://data.humdata.org/) (HDX) project. If you have
humanitarian-related data, please upload your datasets to HDX.
77 changes: 52 additions & 25 deletions documentation/main.md
@@ -5,31 +5,38 @@ through an ssh tunnel if needed. It is built on top of SQLAlchemy and simplifies

# Information

This library is part of the [Humanitarian Data Exchange](https://data.humdata.org/) (HDX) project. If you have
humanitarian-related data, please upload your datasets to HDX.

The code for the library is [here](https://github.com/OCHA-DAP/hdx-python-database).
The library has detailed API documentation which can be found in the menu at the top.

Additional postgresql functionality is available if this library is installed with extra "postgresql":
Connecting to databases through an ssh tunnel is only available if this library
is installed with extra `sshtunnel`:

pip install hdx-python-database[sshtunnel]

The `wait_for_postgresql` function is only available if this library is
installed with extra `postgresql`:

pip install hdx-python-database[postgresql]

## Breaking changes
From 1.2.8, the sshtunnel dependency is optional.

From 1.2.7, default table names are no longer plural. The camel case class name
is converted to snake case, for example `MyTable` becomes `my_table`.

From 1.2.3, Base must be chosen from `hdx.database.no_timezone`
(`db_has_tz=False`: the default) or `hdx.database.with_timezone`
(`db_has_tz=True`).

From 1.2.2, database datetime columns are assumed to be timezoneless unless
db_has_tz is set to True.

From 1.2.1, `wait_for_postgresql` takes a connection URI instead of database
parameters. `get_params_from_sqlalchemy_url` is renamed to
`get_params_from_connection_uri` and moved to the `dburi` module, and
`get_sqlalchemy_url` is renamed to `get_connection_uri` and moved to the
`dburi` module. There is a new function `remove_driver_from_uri` in the `dburi`
module. The parameter `driver` is replaced by dialect+driver. Supports Python 3.8 and later.

@@ -38,7 +45,7 @@ is renamed wait_for_postgresql.

Versions from 1.0.8 support Python 3.6 and later.

Versions from 1.0.6 no longer support Python 2.7.

# Description of Utilities

@@ -51,40 +58,40 @@ Your SQLAlchemy database tables must inherit from `Base` in
    class MyTable(Base):
        my_col: Mapped[int] = mapped_column(Integer, ForeignKey(MyTable2.col2), primary_key=True)

A default table name is set which can be overridden: it is the camel case class
name converted to snake case, for example `MyTable` becomes `my_table`.
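
The conversion described above can be illustrated with a short sketch. This is not the library's implementation: `camel_to_snake` is a hypothetical helper, and the regular expression is only one common way to perform the conversion. Note that this naive version splits before every capital letter, so an acronym-style name such as `DBOrgType` would come out as `d_b_org_type`.

```python
import re


def camel_to_snake(name: str) -> str:
    """Convert a camel case class name to snake case (illustrative helper)."""
    # Insert an underscore before every uppercase letter that is not at the
    # start of the name, then lowercase the whole string.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


print(camel_to_snake("MyTable"))  # my_table
```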

Then a connection can be made to a database as follows including through an SSH
tunnel:
tunnel (which requires installing `hdx-python-database[sshtunnel]`):

    # Get SQLAlchemy session object given database parameters and
    # if needed SSH parameters. If database is PostgreSQL, will poll
    # till it is up.
    from hdx.database import Database
    with Database(database="db", host="1.2.3.4", username="user",
                  password="pass", dialect="dialect", driver="driver",
                  ssh_host="5.6.7.8", ssh_port=2222, ssh_username="sshuser",
                  ssh_private_key="path_to_key", db_has_tz=True,
                  reflect=False) as session:
        session.query(...)

`db_has_tz`, which defaults to `False`, indicates whether database datetime
columns have timezones. If `db_has_tz` is `True`, use `Base` from
`hdx.database.with_timezone`, otherwise use `Base` from
`hdx.database.no_timezone`. If `db_has_tz` is `False`, conversion occurs
between Python datetimes with timezones and timezoneless database columns.

If `reflect` (which defaults to `False`) is `True`, classes will be reflected
from an existing database and the reflected classes are returned in a variable
`reflected_classes` in the returned Session object. Note that type annotation
maps don't work with reflection and hence `db_has_tz` will be ignored, i.e.
there will be no conversion between Python datetimes with timezones and
timezoneless database columns.
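
To illustrate the kind of conversion involved when `db_has_tz` is `False`, the sketch below turns a timezone-aware Python datetime into a timezoneless one and back. It assumes values are normalised to UTC before the timezone is dropped, which is an assumption made for this example rather than something the text above states. `to_naive_utc` and `from_naive_utc` are hypothetical helpers, not part of the library.

```python
from datetime import datetime, timedelta, timezone


def to_naive_utc(value: datetime) -> datetime:
    """Normalise an aware datetime to UTC and drop its tzinfo (assumed behaviour)."""
    if value.tzinfo is None:
        return value  # already timezoneless
    return value.astimezone(timezone.utc).replace(tzinfo=None)


def from_naive_utc(value: datetime) -> datetime:
    """Attach UTC to a timezoneless datetime read back from storage."""
    return value.replace(tzinfo=timezone.utc)


aware = datetime(2023, 10, 11, 12, 0, tzinfo=timezone(timedelta(hours=2)))
stored = to_naive_utc(aware)
print(stored)  # 2023-10-11 10:00:00
print(from_naive_utc(stored) == aware)  # True
```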

## Connection URI

There are functions to handle converting from connection URIs to parameters and
vice-versa as well as a function to remove the driver string from a connection
URI that contains both dialect and driver.

# Extract dictionary of parameters from database connection URI
@@ -108,6 +115,27 @@ URI that contains both dialect and driver.
"postgresql+psycopg://myuser:mypass@myserver:1234/mydatabase"
)
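
What these conversions involve can be sketched with the standard library alone. The code below is not the `dburi` module: `split_connection_uri` and `strip_driver` are hypothetical helpers, and the dictionary keys are chosen to mirror the `Database` parameters shown earlier rather than taken from the library's actual return value.

```python
from urllib.parse import urlsplit


def split_connection_uri(uri: str) -> dict:
    """Break a dialect+driver connection URI into its components."""
    parts = urlsplit(uri)
    # The scheme carries "dialect" or "dialect+driver".
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver or None,
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


def strip_driver(uri: str) -> str:
    """Drop the "+driver" suffix from the scheme, keeping the rest intact."""
    scheme, sep, rest = uri.partition("://")
    return scheme.split("+", 1)[0] + sep + rest


uri = "postgresql+psycopg://myuser:mypass@myserver:1234/mydatabase"
print(split_connection_uri(uri)["port"])  # 1234
print(strip_driver(uri))  # postgresql://myuser:mypass@myserver:1234/mydatabase
```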

## Views

The method to make views described [here](https://github.com/sqlalchemy/sqlalchemy/wiki/Views#sqlalchemy-14-20-version)
is available in this library.

This allows creating views like this:
```
class DBOrgType(Base):
    __tablename__ = "org_type"

    code: Mapped[str] = mapped_column(String(32), primary_key=True)
    description: Mapped[str] = mapped_column(String(512), nullable=False)


org_type_view = view(
    name="org_type_view",
    metadata=Base.metadata,
    selectable=select(*DBOrgType.__table__.columns),
)
```

## PostgreSQL specific

There is a PostgreSQL specific call that only returns when the PostgreSQL database
@@ -116,4 +144,3 @@ is available:
# Wait until PostgreSQL is up
# Library should be installed with hdx-python-database[postgresql]
wait_for_postgresql("postgresql+psycopg://myuser:mypass@myserver:5432/mydatabase")

6 changes: 3 additions & 3 deletions pyproject.toml
@@ -34,8 +34,7 @@ classifiers = [
requires-python = ">=3.8"

dependencies = [
"sqlalchemy>= 2",
"sshtunnel",
"sqlalchemy>=2",
]
dynamic = ["version"]

@@ -47,6 +46,7 @@ content-type = "text/markdown"
Homepage = "https://github.com/OCHA-DAP/hdx-python-database"

[project.optional-dependencies]
sshtunnel = ["sshtunnel"]
postgresql = ["psycopg[binary]"]
test = ["pytest", "pytest-cov"]
dev = ["pre-commit"]
@@ -76,7 +76,7 @@ version_scheme = "python-simplified-semver"
# Tests

[tool.hatch.envs.test]
features = ["postgresql", "test"]
features = ["sshtunnel", "postgresql", "test"]

[tool.hatch.envs.test.scripts]
test = """
40 changes: 20 additions & 20 deletions requirements.txt
@@ -6,63 +6,63 @@
#
bcrypt==4.0.1
# via paramiko
cffi==1.15.1
cffi==1.16.0
# via
# cryptography
# pynacl
cfgv==3.3.1
cfgv==3.4.0
# via pre-commit
coverage[toml]==7.2.7
coverage[toml]==7.3.2
# via pytest-cov
cryptography==41.0.1
cryptography==41.0.4
# via paramiko
distlib==0.3.6
distlib==0.3.7
# via virtualenv
filelock==3.12.2
filelock==3.12.4
# via virtualenv
greenlet==2.0.2
greenlet==3.0.0
# via sqlalchemy
identify==2.5.24
identify==2.5.30
# via pre-commit
iniconfig==2.0.0
# via pytest
nodeenv==1.8.0
# via pre-commit
packaging==23.1
packaging==23.2
# via pytest
paramiko==3.2.0
paramiko==3.3.1
# via sshtunnel
platformdirs==3.8.0
platformdirs==3.11.0
# via virtualenv
pluggy==1.2.0
pluggy==1.3.0
# via pytest
pre-commit==3.3.3
pre-commit==3.4.0
# via hdx-python-database (pyproject.toml)
psycopg[binary]==3.1.9
psycopg[binary]==3.1.12
# via hdx-python-database (pyproject.toml)
psycopg-binary==3.1.9
psycopg-binary==3.1.12
# via psycopg
pycparser==2.21
# via cffi
pynacl==1.5.0
# via paramiko
pytest==7.4.0
pytest==7.4.2
# via
# hdx-python-database (pyproject.toml)
# pytest-cov
pytest-cov==4.1.0
# via hdx-python-database (pyproject.toml)
pyyaml==6.0
pyyaml==6.0.1
# via pre-commit
sqlalchemy==2.0.17
sqlalchemy==2.0.21
# via hdx-python-database (pyproject.toml)
sshtunnel==0.4.0
# via hdx-python-database (pyproject.toml)
typing-extensions==4.6.3
typing-extensions==4.8.0
# via
# psycopg
# sqlalchemy
virtualenv==20.23.1
virtualenv==20.24.5
# via pre-commit

# The following packages are considered to be unsafe in a requirements file:
11 changes: 9 additions & 2 deletions src/hdx/database/__init__.py
@@ -6,7 +6,6 @@
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import DeclarativeBase, Session, sessionmaker
from sqlalchemy.pool import NullPool
from sshtunnel import SSHTunnelForwarder

from ._version import version as __version__ # noqa: F401
from .dburi import get_connection_uri
@@ -62,6 +61,14 @@ def __init__(
        if port is not None:
            port = int(port)
        if len(kwargs) != 0:
            try:
                import sshtunnel
            except ImportError:
                # dependency missing, log an error
                logger.error(
                    "sshtunnel not found! Please install hdx-python-database[sshtunnel] to enable."
                )
                raise
            ssh_host = kwargs["ssh_host"]
            del kwargs["ssh_host"]
            ssh_port = kwargs.get("ssh_port")
@@ -70,7 +77,7 @@ def __init__(
                del kwargs["ssh_port"]
            else:
                ssh_port = 22
            self.server = SSHTunnelForwarder(
            self.server = sshtunnel.SSHTunnelForwarder(
                (ssh_host, ssh_port),
                remote_bind_address=(host, port),
                **kwargs,
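
The deferred import in this hunk follows a common pattern for optional dependencies: import the module only when the feature is requested, and tell the user which extra to install if it is missing. The following is a standalone sketch of that pattern, not code from this repository; `load_optional` and its parameters are hypothetical names, and a standard library module stands in for the optional dependency so that the sketch runs without `sshtunnel` installed.

```python
import importlib
import logging
from types import ModuleType

logger = logging.getLogger(__name__)


def load_optional(module_name: str, extra: str) -> ModuleType:
    """Import a module on demand, logging an install hint if it is absent."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # Dependency missing: log which extra provides it, then re-raise so
        # the caller still sees the original ImportError.
        logger.error(
            "%s not found! Please install hdx-python-database[%s] to enable.",
            module_name,
            extra,
        )
        raise


# A stdlib module stands in for the optional dependency here.
json_module = load_optional("json", "sshtunnel")
print(json_module.dumps({"ok": True}))
```

Re-raising after logging keeps the failure visible to callers, while users who never pass SSH parameters never trigger the import at all.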
