Merged
93 changes: 93 additions & 0 deletions .github/workflows/prerelease.yml
@@ -0,0 +1,93 @@
name: Prerelease workflow

on:
  pull_request:
    branches:
      - 'main'

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: [3.8]

    steps:
      - uses: actions/checkout@v4
        with:
          persist-credentials: false # use GITHUB_TOKEN
          fetch-depth: 1 # fetch depth is nr of commits
          ref: ${{ github.head_ref }}

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install pyriksdagen
          pip install pdoc

      - name: Get release type
        run: |
          release_type=$(echo ${{ github.event.pull_request.title }} | grep -Eoh '((M|m)(ajor|inor)|(P|p)atch)' | awk '{print tolower($0)}')
          if [ -z "$release_type" ] ; then
            echo "You have to indicate the release type in the title of a pr to main"
            echo " suggested: \`prerelease: major|minor|patch\` version"
            exit 1
          else
            echo "Next release will be a $release_type version"
            echo "RELEASE_TYPE=$release_type" >> $GITHUB_ENV
          fi

      - name: Install jq
        uses: dcarbone/install-jq-action@v2
        with:
          version: 1.7
          force: false

      - name: Get most recent release
        run: |
          LATEST_RELEASE=$(echo "$(curl -L https://api.github.com/repos/swerik-project/scripts/releases/latest)" | jq -r .tag_name)
          if [[ "$LATEST_RELEASE" == null ]] ; then LATEST_RELEASE="v0.0.0" ; fi
          echo "LAST_RELEASE=$LATEST_RELEASE" >> $GITHUB_ENV

      - name: Bump version
        id: bump
        uses: cbrgm/semver-bump-action@main
        with:
          current-version: ${{ env.LAST_RELEASE }}
          bump-level: ${{ env.RELEASE_TYPE }}

      - name: bump to env
        run: |
          release_nr=${{ steps.bump.outputs.new_version }}
          echo "RELEASE_NR=$release_nr" >> $GITHUB_ENV

      - name: Build documentation
        run: |
          echo "Release version ${{ env.RELEASE_NR }}"
          pdoc -o docs --footer-text ${{ env.RELEASE_NR }} -t docs/dark-mode --logo https://raw.githubusercontent.com/swerik-project/the-swedish-parliament-corpus/refs/heads/main/readme/riksdagshuset.jpg ../scripts

      - name: Add and commit changes
        run: |
          git config --local user.email "41898282+github-actions[bot]@users.noreply.github.com"
          git config --local user.name "github-actions[bot]"
          if [[ `git status docs/ --porcelain --untracked-files=no` ]]; then
            git add docs
            git commit -m "chore (workflow): update documentation"
          else
            echo ""
            echo "::warning:: WARNING!!! No changes to documentation files."
            echo " Double check the version nr and everything else is up to date."
            echo ""
            git commit --allow-empty -m "chore (workflow): no changes to documentation"
          fi

      - name: Push changes
        uses: ad-m/github-push-action@master
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          branch: ${{ github.head_ref }}
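
The title parsing and version bump performed by this workflow can be approximated locally in Python, which may help when checking what release number a given PR title would produce. This is only a rough sketch, not the implementation of `cbrgm/semver-bump-action`: the function names `release_type_from_title` and `bump` are hypothetical, tags are assumed to have the form `vMAJOR.MINOR.PATCH`, and only the first keyword found in the title is used (the workflow's `grep -o` would emit every match).

```python
import re
from typing import Optional

# Same alternatives as the workflow's grep pattern:
# ((M|m)(ajor|inor)|(P|p)atch)
RELEASE_RE = re.compile(r"(?:[Mm](?:ajor|inor)|[Pp]atch)")


def release_type_from_title(title: str) -> Optional[str]:
    """Return 'major', 'minor' or 'patch' from a PR title, or None if absent."""
    match = RELEASE_RE.search(title)
    return match.group(0).lower() if match else None


def bump(tag: str, level: str) -> str:
    """Bump a 'vX.Y.Z' tag at the given level (tag format is an assumption)."""
    major, minor, patch = (int(part) for part in tag.lstrip("v").split("."))
    if level == "major":
        return f"v{major + 1}.0.0"
    if level == "minor":
        return f"v{major}.{minor + 1}.0"
    if level == "patch":
        return f"v{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown bump level: {level!r}")
```

With no previous release the workflow falls back to `v0.0.0`, so under these assumptions a first PR titled `prerelease: patch version` would correspond to `bump("v0.0.0", "patch")`.
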
2 changes: 1 addition & 1 deletion .gitignore
@@ -13,4 +13,4 @@ pyriksdagen/source/
input/KWIC/*.csv
input/rawpdf/*
**/_*

!**/__init__.py
8 changes: 7 additions & 1 deletion README.md
@@ -1,7 +1,11 @@
# Scripts – Data curation and processing logic for the Swedish Parliament Corpus

This repository contains various scripts for curating and working with data from the Swedish Riksdag. The repo is "internal" in some sense -- we make no effort to maintain compatibility or to provide thorough documentation, and it is not intended as part of the project's API. Nevertheless, we feel that users might find some utility in these example scripts.

## General setup and use

The general recommendation is to set up a Python virtual environment for working with this data set and these scripts. Do that however you like -- below is just one example of how it can be done. We're working with Python 3.8 due to compatibility issues with e.g. TensorFlow.

### Setting up an environment

Set up a conda environment: follow the steps [here](https://www.tensorflow.org/install/pip).
@@ -26,10 +30,12 @@ The `LazyArchive()` class attempts to connect to the KB labs in the lazyest way
KBLUSER=
KBLPASS=

We are phasing out reliance on kblabb servers, and this will soon be deprecated.

The `KBLUSER` and `KBLPASS` credentials can be added to the environment variables, e.g. in `~/miniconda3/envs/tf/etc/conda/activate.d/env_vars.sh`. If they are not present, you will be prompted for the username and password.
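
The "environment variable first, prompt as a fallback" behaviour described here can be sketched as follows. This is not the actual `LazyArchive()` code, only an illustration of the pattern: the function name `kblab_credentials` and the prompt texts are hypothetical, and the only names taken from this README are `KBLUSER` and `KBLPASS`.

```python
import os
from getpass import getpass
from typing import Callable, Mapping, Tuple


def kblab_credentials(
    env: Mapping[str, str] = os.environ,
    ask_user: Callable[[str], str] = input,
    ask_password: Callable[[str], str] = getpass,
) -> Tuple[str, str]:
    """Read KBLUSER/KBLPASS from the environment, prompting for missing values."""
    # An unset or empty variable falls through to the interactive prompt.
    user = env.get("KBLUSER") or ask_user("KBLab username: ")
    password = env.get("KBLPASS") or ask_password("KBLab password: ")
    return user, password
```

Passing the environment and the prompt functions as parameters is a design choice made for this sketch so that the fallback can be exercised without a terminal.
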


-## Curating data
+## Curating Records data


Most scripts take `--start` YEAR and `--end` YEAR arguments to define a span of time to operate on. Other options are noted with each file below.
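
A script following the `--start`/`--end` convention might parse its arguments roughly like this. It is a minimal sketch rather than code taken from any script in this repo: the function name `parse_span` is hypothetical, and treating `--end` as inclusive and both options as required are assumptions that individual scripts may not share.

```python
import argparse
from typing import List, Optional


def parse_span(argv: Optional[List[str]] = None) -> range:
    """Parse --start/--end and return the years to operate on."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--start", type=int, required=True, help="first year")
    parser.add_argument("--end", type=int, required=True, help="last year (inclusive here)")
    args = parser.parse_args(argv)
    if args.end < args.start:
        parser.error("--end must not be earlier than --start")
    # Assumption: the end year is included in the span.
    return range(args.start, args.end + 1)
```

Called without arguments, `parse_span()` reads `sys.argv`, so a script using it could be run as `python some_script.py --start 1920 --end 1922` (the script name is a placeholder).
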
3 changes: 3 additions & 0 deletions __init__.py
@@ -0,0 +1,3 @@
"""
.. include:: README.md
"""