PyPGIO - An IO generator for PostgreSQL based on the original pgio and SLOB
PyPGIO is a Python-based I/O generator for PostgreSQL databases. Its sole purpose is to drive (a lot of) I/O from a database without requiring a lot of CPU resources.
It is based on, and uses some code from, PGIO by Kevin Closson; see PGIO for more details.
- Linux shell account (can be on a different system than the database server)
- $HOME/bin directory that is in $PATH
- PostgreSQL database (any recent supported version)
- User on the database with full access rights on the database
- Python 3.6 or higher (python3 must be in the environment)
- Internet access to install Python PIP packages
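The local prerequisites can be checked before installing. The following is a minimal sketch, not part of pgio: the function names dir_in_path, python_ok and check_prereqs are illustrative, and the database and internet access requirements are not tested here.

```shell
# Sketch of a prerequisite check; function names are illustrative, not part of pgio.

# Succeeds if directory $1 appears as an entry in the colon-separated list $2.
dir_in_path() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Succeeds if python3 is available and reports version 3.6 or higher.
python_ok() {
    command -v python3 >/dev/null 2>&1 &&
        python3 -c 'import sys; sys.exit(0 if sys.version_info >= (3, 6) else 1)'
}

# Prints a warning for each local prerequisite that is not met.
check_prereqs() {
    [ -d "$HOME/bin" ]              || echo "WARNING: $HOME/bin does not exist"
    dir_in_path "$HOME/bin" "$PATH" || echo "WARNING: $HOME/bin is not in PATH"
    python_ok                       || echo "WARNING: python3 3.6 or higher not found"
}

# Usage: check_prereqs   (prints nothing when all checks pass)
```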
Assuming the prerequisites are met, the easiest way to install the latest version of pypgio is to run the downloader command on your host:
# Test if python3 works
python3 --version
# This requires internet access via https
# Inspect downloader (optional)
curl https://raw.githubusercontent.com/bsjerps/pypgio/master/scripts/download | less
# Download using downloader
curl https://raw.githubusercontent.com/bsjerps/pypgio/master/scripts/download | bash
# Check if pgio is installed
ls -al ~/bin/pgio
Note that PyPGIO is designed to run as a ZipApp package. Running directly from the git repository is not recommended. Instead, create the package from the repo:
# Clone the repository
git clone https://github.com/bsjerps/pypgio.git
cd pypgio
scripts/mkapp
ls -al ~/bin/pgio
See create_db.sql for details on creating the database and user.
# Run pgio
pgio
# Run installer (this sets up the virtual environment and bash completion)
pgio install
# Logout and login again - bash completion should now work
# Or source the completions
source ~/.bash_completion
# Show help summary
pgio -h
# Show default configuration settings
pgio configure
# Show configuration parameters
pgio configure -h
# Change the database host
pgio configure --dbhost pgserver.lan
# Test connection
pgio
We assume a database named pgio with a user pgio and password pgio. By default, pgio uses the public schema and the default tablespace.
On RHEL, the database top directory is /var/lib/pgsql/<version>/data. In many cases, however, we want the (large) pgio tables on another file system. In our example, we have a file system /pgdata on which we created a PostgreSQL tablespace bulk, and we want our tables to be generated there.
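A tablespace such as bulk could be created along the following lines. This is a sketch under assumptions: the helper name tablespace_sql is illustrative and not part of pgio, the directory /pgdata/bulk is hypothetical, and PostgreSQL requires the location to be an existing, empty directory owned by the PostgreSQL operating system user. The statement must be run as a database superuser.

```shell
# Print the SQL statement that creates tablespace $1 at directory $2.
# Illustrative helper, not part of pgio.
tablespace_sql() {
    printf "CREATE TABLESPACE %s LOCATION '%s';\n" "$1" "$2"
}

# Usage (assumption: run as the postgres OS user on the database server,
# with /pgdata/bulk being an empty directory owned by that user):
#   tablespace_sql bulk /pgdata/bulk | psql
```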
# Set schemas and scale
pgio configure --schemas 8 --scale 256M
# Optionally set the default tablespace
pgio configure --tablespace bulk
# Create database structure and tables, using 4 parallel threads
pgio setup 4
# Check the tables
pgio list
Now we can run pgio:
# Run pgio for 60 seconds with 8 threads
pgio run 60 8
# Change the update percentage
pgio configure --update_pct 10
# Run pgio for 60 seconds with 4 threads
pgio run 60 4
# View detailed reports
pgio report -v
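To compare results at different levels of concurrency, the run command can be wrapped in a loop. The following is a minimal sketch, assuming pgio run takes the duration and thread count as in the examples above; sweep_threads and the PGIO variable are illustrative names, not part of pgio.

```shell
# Run pgio for a fixed duration at several thread counts.
# $1 = seconds per run, remaining arguments = thread counts.
# PGIO can be overridden (e.g. PGIO=echo) for a dry run that only prints the commands.
sweep_threads() {
    local seconds=$1
    shift
    local threads
    for threads in "$@"; do
        echo "=== ${threads} threads, ${seconds} seconds ==="
        "${PGIO:-pgio}" run "$seconds" "$threads" || return 1
    done
}

# Usage: sweep_threads 60 1 2 4 8
```

The loop stops at the first failing run, so a connection problem does not result in a series of empty measurements.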
Uninstalling removes all pgio-related objects from the database, as well as the virtual environment and shell settings in the user's home directory:
# Delete the database structures
pgio destroy
# Remove bash setup and virtual environment
pgio uninstall
# Remove pgio from $HOME/bin
rm ~/bin/pgio