Commit

Update with upstream/master (#4)
* Fix usage examples (J535D165#190)

* Fix broken links (J535D165#186)

This commit fixes broken links in readme.

* Add threshold None and label docstrings for String (J535D165#189)

* Add support for pandas==2 (J535D165#192)

* Replace setup.py by pyproject.toml (J535D165#195)

* Lint with Ruff and format with Black (J535D165#196)

* Lint with Ruff and format with Black

* Fix more lint issues

* Fix datasets submodule

* Fix all lint errors

* Fix importerror

* Replace flake8 in github action by ruff

* Fix linter

* Fix abstractmethod errors

* Fix test with incorrect error

* Update ci-workflow.yml

* Update CI docs generation and CI pipeline (J535D165#197)

* Bump minimal versions of dependencies

* Update the docs CI pipeline (J535D165#198)

* Add requirements to .readthedocs.yaml

* Bump minimal Python version in documentation

* Add pre-commit hooks (J535D165#199)

* Update CI pipeline for publishing package

* disable docs and publish GH actions

* only trigger on PR

* fixed linting

* updated to latest ruff

* Update GitHub Actions workflows

---------

Co-authored-by: Martinho Hoffman <39743428+martinhohoff@users.noreply.github.com>
Co-authored-by: andyjessen <62343929+andyjessen@users.noreply.github.com>
Co-authored-by: David GG <37239554+davidggphy@users.noreply.github.com>
Co-authored-by: Jonathan de Bruin <jonathandebruinos@gmail.com>
5 people committed Feb 10, 2024
1 parent d3cdb24 commit 2199885
Showing 59 changed files with 425 additions and 3,293 deletions.
1 change: 0 additions & 1 deletion .gitattributes

This file was deleted.

35 changes: 21 additions & 14 deletions .github/workflows/ci-workflow.yml
@@ -1,7 +1,7 @@
name: tests

on: [push, pull_request]

# on: [push, pull_request]
on: [pull_request]
jobs:
build:

@@ -10,33 +10,40 @@ jobs:
fail-fast: false
matrix:
python-version: ["3.8", "3.9", "3.10", "3.11"]

pandas-version: ["1.0", "2.0"]
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v4
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
- name: Install pandas
run: |
pip install pandas~=${{ matrix.pandas-version }}
- name: Package recordlinkage
run: |
pip install --upgrade pip
pip install wheel
python setup.py bdist_wheel sdist
pip install build
python -m build
- name: Install recordlinkage
run: |
pip install networkx>=2
pip install ./dist/recordlinkage-*.whl
# - name: Lint with flake8
# run: |
# pip install flake8
# # stop the build if there are Python syntax errors or undefined names
# flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
# # exit-zero treats all errors as warnings
# flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
- name: Test with pytest
run: |
pip install pytest
# remove recordlinkage to prevent relative imports (use installed package)
# this is like wrapping stuff in a src folder
rm -r recordlinkage/
pytest
lint:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
- name: Install ruff
run: |
pip install ruff
- name: Lint with ruff
run: |
ruff .
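
The `matrix` block in the workflow above lists two axes, `python-version` and `pandas-version`. GitHub Actions runs one job per combination of the matrix values, so the test job should fan out into eight runs. A minimal sketch of that expansion, assuming both axes are present in the updated file (the diff markers are lost in this view) and using a hypothetical helper name `expand_matrix`:

```python
from itertools import product

# Axis values copied from the workflow's matrix block.
python_versions = ["3.8", "3.9", "3.10", "3.11"]
pandas_versions = ["1.0", "2.0"]


def expand_matrix(**axes):
    """Return one dict per combination of axis values (hypothetical helper)."""
    names = list(axes)
    return [dict(zip(names, combo)) for combo in product(*axes.values())]


jobs = expand_matrix(
    python_version=python_versions,
    pandas_version=pandas_versions,
)

for job in jobs:
    # Mirrors the "Install pandas" step: pip install pandas~=<pandas-version>
    print(f"python {job['python_version']} / pandas~={job['pandas_version']}")
```

Each printed line corresponds to one CI job; with `fail-fast: false`, a failure in one combination does not cancel the others.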
55 changes: 11 additions & 44 deletions .github/workflows/python-package.yml
@@ -1,8 +1,11 @@
# name: deploy-and-release
# name: Upload Python Package

# on:
# push:
# tags:
# - 'v*' # Push events to matching v*, i.e. v1.0, v20.15.10
# release:
# types: [published]

# permissions:
# contents: read

# jobs:
# deploy:
@@ -13,50 +16,14 @@
# uses: actions/setup-python@v4
# with:
# python-version: '3.x'
# - name: Get the version (git tag)
# id: get_version
# run: |
# echo ${GITHUB_REF/refs\/tags\/v/}
# echo ::set-output name=VERSION::${GITHUB_REF/refs\/tags\/v/}
# - name: Install dependencies
# run: |
# python -m pip install --upgrade pip
# pip install setuptools wheel
# - name: Build
# run: |
# python setup.py sdist bdist_wheel
# - name: Create Release
# id: create_release
# uses: actions/create-release@v1.0.0
# env:
# GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
# with:
# tag_name: ${{ github.ref }}
# release_name: Release ${{ github.ref }}
# draft: false
# prerelease: false
# - name: Upload Release Asset (Wheel)
# id: upload-release-asset-whl
# uses: actions/upload-release-asset@v1.0.1
# env:
# GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
# with:
# upload_url: ${{ steps.create_release.outputs.upload_url }}
# asset_path: ./dist/recordlinkage-${{ steps.get_version.outputs.VERSION }}-py3-none-any.whl
# asset_name: recordlinkage-${{ steps.get_version.outputs.VERSION }}-py3-none-any.whl
# asset_content_type: application/x-wheel+zip
# - name: Upload Release Asset (Sdist)
# id: upload-release-asset-sdist
# uses: actions/upload-release-asset@v1.0.1
# env:
# GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
# with:
# upload_url: ${{ steps.create_release.outputs.upload_url }}
# asset_path: ./dist/recordlinkage-${{ steps.get_version.outputs.VERSION }}.tar.gz
# asset_name: recordlinkage-${{ steps.get_version.outputs.VERSION }}.tar.gz
# asset_content_type: application/zip
# pip install build
# - name: Build package
# run: python -m build
# - name: Publish package
# uses: pypa/gh-action-pypi-publish@master
# uses: pypa/gh-action-pypi-publish@release/v1
# with:
# user: __token__
# password: ${{ secrets.pypi_password }}
25 changes: 9 additions & 16 deletions .github/workflows/render-docs.yml
@@ -1,26 +1,19 @@
# name: Build HTML on macOS
# name: Build HTML with Sphinx
# on: [push, pull_request]
# jobs:
# html-macos:
# runs-on: macos-latest
# html-sphinx:
# runs-on: ubuntu-latest
# steps:
# - name: Clone repo
# uses: actions/checkout@v3
# with:
# fetch-depth: 0
# - name: Install pandoc
# run: |
# brew install pandoc
# uses: actions/checkout@v2
# - name: Set up Python
# uses: actions/setup-python@v4
# uses: actions/setup-python@v2
# with:
# python-version: '3.8'
# - name: Install recordlinkage
# run: |
# python -m pip install .[all]
# - name: Install docs dependencies
# python-version: '3.10'
# - name: Install recordlinkage and docs tools
# run: |
# python -m pip install -r docs/requirements.txt
# sudo apt install pandoc
# python -m pip install .[docs]
# - name: Build HTML
# run: |
# python -m sphinx -W --keep-going --color docs/ _build/html/
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,6 +1,8 @@

recordlinkage/datasets/krebsregister/*

recordlinkage/_version.py


.DS_Store
*/.DS_Store
9 changes: 4 additions & 5 deletions .pre-commit-config.yaml
@@ -13,14 +13,13 @@ repos:
rev: 24.1.1
hooks:
- id: black
exclude: versioneer.py
- repo: https://github.com/asottile/pyupgrade
rev: v3.15.0
hooks:
- id: pyupgrade
args: [--py38-plus]
exclude: versioneer.py
- repo: https://github.com/PyCQA/flake8
rev: 7.0.0
- repo: https://github.com/charliermarsh/ruff-pre-commit
rev: v0.2.1
hooks:
- id: flake8
- id: ruff
args: [--fix]
16 changes: 16 additions & 0 deletions .readthedocs.yaml
@@ -0,0 +1,16 @@
version: 2

build:
os: ubuntu-22.04
tools:
python: "3.11"

sphinx:
configuration: docs/conf.py

python:
install:
- method: pip
path: .
extra_requirements:
- docs
4 changes: 0 additions & 4 deletions MANIFEST.in
@@ -1,9 +1,5 @@
include versioneer.py
include recordlinkage/_version.py

recursive-include recordlinkage/datasets/febrl *.csv
recursive-include recordlinkage/datasets/krebsregister *.csv


global-exclude test_*.py
global-exclude *_test.py
9 changes: 3 additions & 6 deletions README.md
@@ -120,24 +120,21 @@ The main features of this Python record linkage toolkit are:
The most recent documentation and API reference can be found at
[recordlinkage.readthedocs.org](http://recordlinkage.readthedocs.org/en/latest/).
The documentation provides some basic usage examples like
[deduplication](http://recordlinkage.readthedocs.io/en/latest/notebooks/data_deduplication.html)
[deduplication](http://recordlinkage.readthedocs.io/en/latest/guides/data_deduplication.html)
and
[linking](http://recordlinkage.readthedocs.io/en/latest/notebooks/link_two_dataframes.html)
[linking](http://recordlinkage.readthedocs.io/en/latest/guides/link_two_dataframes.html)
census data. More examples are coming soon. If you do have interesting
examples to share, let us know.

## Installation

The Python Record linkage Toolkit requires Python 3.6 or higher. Install the
The Python Record linkage Toolkit requires Python 3.8 or higher. Install the
package easily with pip

``` sh
pip install recordlinkage
```

Python 2.7 users can use version \<= 0.13, but it is advised to use
Python \>= 3.5.

The toolkit depends on popular packages like
[Pandas](https://github.com/pydata/pandas),
[Numpy](http://www.numpy.org), [Scipy](https://www.scipy.org/) and,
3 changes: 2 additions & 1 deletion benchmarks/bench_comparing.py
@@ -1,5 +1,6 @@
import recordlinkage as rl
from recordlinkage.datasets import load_febrl1, load_febrl4
from recordlinkage.datasets import load_febrl1
from recordlinkage.datasets import load_febrl4


class CompareRecordLinkage:
3 changes: 2 additions & 1 deletion benchmarks/bench_indexing.py
@@ -1,5 +1,6 @@
import recordlinkage as rl
from recordlinkage.datasets import load_febrl1, load_febrl4
from recordlinkage.datasets import load_febrl1
from recordlinkage.datasets import load_febrl4


class PairsRecordLinkage:
