Make sure to use git-lfs
to pull the model checkpoints too alongside the code.
Our data is available on huggingface here!
- If you are using
conda
, consider creating a new environment, and make sure to runconda install pip
so that the following dependencies are installed for your environment. - It's recommended that you use python 3.11.4
- Install dependencies with
poetry install
. - [Optional] Install Spacy module with
python -m spacy download en_core_web_lg
. - Start the poetry shell
poetry shell
. - If you see an error related to
psycopg2-binary
, the easiest solution is probably to install it via pip.
- Use brew to install PostgreSQL 12:
brew install postgresql@12
. - The server should automatically start. You can use brew services to manage it, e.g.,
brew services stop postgresql@12
. - Create DB cluster
initdb
, then create DBcreatedb karl-prod
. - You may need to modify
alembic.ini
to specifysqlalchemy.url
to have your name
- Restore from dump `gunzip -c data/karl-dev.gz | psql karl-prod
- Run
alembic upgrade head
The default PostgreSQL runtime directory is not available on UMIACS machines, so extra steps are required to redirect it.
- Create DB cluster
initdb -D /fs/clip-quiz/shifeng/postgres
- Create the runtime directory
/fs/clip-quiz/shifeng/postgres/run
- Open
/fs/clip-quiz/shifeng/postgres/postgresql.conf
, findunix_socket_directories
and point it to the runtime directory created above. - Start the server
postgres -D /fs/clip-quiz/shifeng/postgres -p 5433
- Create DB
createdb -h /fs/clip-quiz/shifeng/postgres/run -p 5433 karl-prod
- Create dump
pg_dump karl-prod -h /fs/clip-quiz/shifeng/postgres/run -p 5433 | gzip > /fs/clip-quiz/shifeng/db-karl-backup/karl-prod_20210309.gz
- Restore from dump
gunzip -c /fs/clip-quiz/shifeng/db-karl-backup/karl-prod_20210309.gz | psql -p 5433 karl-prod
. Need to drop existing database first.
- Start the scheduler itself:
uvicorn karl.web:app --log-level debug
- If you are using a retention model hosted on UMIACS machine, you can likely use the following command to connect to it
ssh -NfL 8001:localhost:8001 your_name@nexusclip00.umiacs.umd.edu
. An alternative is to use an ssh config - If you are running a local retention model, start it with:
uvicorn karl.retention_hf.web:app --log-level info --port 8001
Add the below code to ~/.ssh/config and you can call ssh clip
going forward to connect to the retention model
Host clip
LocalForward [your MODEL_API_URL] localhost:8001 # my end, other end
User [add user]
Hostname nexusclip00.umiacs.umd.edu
- After
poetry shell
, runpython -m karl.tests.test_scheduling_with_session
.
You need a .env
file in the karl
directory. Modify CODE_DIR
as needed and change shifeng
in SQLALCHEMY_DATABASE_URL
to your user (check via SELECT current_user;
).
Change API_URL
to match with the INTERFACE
variable in the app. You may also need to specify a password to your database url.
CODE_DIR="/Users/shifeng/workspace/fact-repetition"
# Should match with port defined in INTERFACE in karl app .env
API_URL="http://0.0.0.0:8000"
MODEL_API_URL="http://0.0.0.0:8001"
SQLALCHEMY_DATABASE_URL="postgresql+psycopg2://shifeng@localhost:5432/karl-prod"
USE_MULTIPROCESSING=True
MP_CONTEXT="fork"
For example, to keep the development branch DB up to date with master branch.
- Create dump:
pg_dump karl-prod -h /fs/clip-quiz/shifeng/postgres/run -p 5433 | gzip > /fs/clip-quiz/shifeng/karl/backup/karl-prod_20201017.gz
- On the development machine, delete the existing database:
psql -p 5433 karl-dev
thenDROP DATABASE "karl-prod";
- Load the dump
gunzip -c /fs/clip-quiz/shifeng/db-karl-backup/karl-prod_20210309.gz | psql -p 5433 karl-prod
- For figures you might need
vega-lite
. Install with conda:conda install -c conda-forge vega vega-lite vega-embed vega-lite-cli
. - Note, can also be done by doing
npm install -g
for vega, vega-lite, vega-embed, and vega-cli.