The project aims to build a reliable and extensible platform for automatic code clone detection. While the primary use case is detecting cheating in student homework, it can be used to automate clone detection for any programs.
.
├── bin -- binary utilities
├── detector -- CCD service
├── docker-compose.yml
├── docs
├── .env.sample -- primary dotenv template
├── frontier -- HTTP gateway service
├── LICENSE -- MIT :)
├── mutator -- Python framework for automatic mutation injection
├── nginx -- nginx configuration
├── .postgresql.env.sample -- PostgreSQL dotenv template
├── README.md -- You are here
├── .s3.env.sample -- S3 dotenv template
├── tokenizer -- language-dependent tokenization implementations
├── .tool-versions -- consider using asdf
└── .volumes -- local docker volumes
The instructions below were tested on Ubuntu 22.04 with the following packages installed:
- Docker Engine (Community) 26.0.0 with compose plugin (2.25.0)
Make sure you have Docker installed; other versions are highly likely to work too.
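The prerequisite above can be checked before continuing. This is a minimal sketch, assuming a POSIX-like shell; check_docker is a hypothetical helper name and not part of the repository.

```shell
# Hypothetical helper: verify that Docker and the compose plugin are available.
check_docker() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker not found in PATH" >&2
    return 1
  fi
  if ! docker compose version >/dev/null 2>&1; then
    echo "docker compose plugin not available" >&2
    return 1
  fi
  # Print the installed versions for comparison with the tested ones.
  docker --version
  docker compose version
}
```

Running check_docker prints both versions, which can be compared with the tested versions listed above.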
-
Clone this git repository:
git clone https://github.com/studyfair/studyfair.git
-
Enter the repository and build docker images:
cd studyfair && docker compose build --pull
-
Although the system is built with the intention of being language-agnostic, currently each language (at the moment only Python) requires its own engine. To build the Python engine, run:
docker build --tag tokenizer-python:mainline tokenizer/python
Now you should configure credentials for each service.
Start by creating VCS-ignored dotenv files:
cp .env.sample .env
cp .postgresql.env.sample .postgresql.env
cp .s3.env.sample .s3.env
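When re-running the setup, a plain cp would overwrite dotenv files that were already filled in. The sketch below copies each sample only if the target does not exist yet. init_dotenv is a hypothetical helper name, and the optional directory argument is an assumption added for illustration.

```shell
# Hypothetical helper: create dotenv files from their samples without
# overwriting files that already exist.
init_dotenv() {
  local dir="${1:-.}"
  local name
  for name in .env .postgresql.env .s3.env; do
    if [ -e "${dir}/${name}" ]; then
      echo "keeping existing ${name}"
    else
      cp "${dir}/${name}.sample" "${dir}/${name}" && echo "created ${name}"
    fi
  done
}
```

Call init_dotenv from the repository root, or pass the repository path as the first argument.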
Open .env and fill in the following variables:
- set S3_ACCESS_KEY_ID via openssl rand -hex 8;
- set S3_SECRET_ACCESS_KEY via openssl rand -hex 8;
- set DETECTOR_ACCESS_TOKEN via openssl rand -hex 16;
- set DETECTOR_WEBHOOK_ACCESS_TOKEN via openssl rand -hex 16;
- set SECRET_KEY_BASE via openssl rand -hex 64;
- if you are going to use the Telegram Bot integration, set TELEGRAM_BOT_API_TOKEN to the corresponding value.
-
(optional) Change Minio root user credentials in .s3.env.
-
(optional) Change PostgreSQL DSN options in .postgresql.env. Make sure they are aligned with the POSTGRES_* variables in .env.
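The generated secrets for .env can also be written by a script instead of by hand. The sketch below uses the same openssl commands as listed above. It assumes that .env consists of plain KEY=value lines; set_env_var and fill_secrets are hypothetical helper names, not part of the repository.

```shell
# Hypothetical helper: set KEY=VALUE in a dotenv file, replacing an existing
# KEY= line or appending a new one. Assumes plain KEY=value lines.
set_env_var() {
  local file="$1"
  local key="$2"
  local value="$3"
  if grep -q "^${key}=" "$file"; then
    # '|' as the sed delimiter is safe here because the values are hex strings.
    sed -i.bak "s|^${key}=.*|${key}=${value}|" "$file" && rm -f "${file}.bak"
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Generate each secret with the openssl command given for it above.
fill_secrets() {
  local file="${1:-.env}"
  set_env_var "$file" S3_ACCESS_KEY_ID "$(openssl rand -hex 8)"
  set_env_var "$file" S3_SECRET_ACCESS_KEY "$(openssl rand -hex 8)"
  set_env_var "$file" DETECTOR_ACCESS_TOKEN "$(openssl rand -hex 16)"
  set_env_var "$file" DETECTOR_WEBHOOK_ACCESS_TOKEN "$(openssl rand -hex 16)"
  set_env_var "$file" SECRET_KEY_BASE "$(openssl rand -hex 64)"
}
```

Running fill_secrets .env overwrites any previous values of these five variables and leaves all other lines untouched. TELEGRAM_BOT_API_TOKEN still has to be set manually.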
Generate certificates by running
openssl req -newkey rsa:2048 -sha256 -nodes -x509 -days 365 \
-keyout ca.key \
-out ca.crt \
-subj "/C=RU/ST=Saint-Petersburg/L=Saint-Petersburg/O=Example Inc/CN=<HOSTNAME>" \
&& mv ca.{key,crt} nginx/certificates
Don't forget to replace <HOSTNAME> with your domain name, IP address, or localhost.
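The certificate command above can be wrapped so that the hostname becomes a parameter and the result is checked right away. This is a sketch only: make_cert is a hypothetical helper name, and unlike the original command it writes directly into the target directory instead of moving the files afterwards.

```shell
# Hypothetical helper: generate a self-signed certificate for HOST into OUTDIR
# (same openssl req options as above), then print its subject and validity dates.
make_cert() {
  local host="${1:-localhost}"
  local outdir="${2:-nginx/certificates}"
  mkdir -p "$outdir" || return 1
  # Key generation progress goes to stderr and is silenced here.
  openssl req -newkey rsa:2048 -sha256 -nodes -x509 -days 365 \
    -keyout "${outdir}/ca.key" \
    -out "${outdir}/ca.crt" \
    -subj "/C=RU/ST=Saint-Petersburg/L=Saint-Petersburg/O=Example Inc/CN=${host}" \
    2>/dev/null || return 1
  # Show what was generated so a wrong CN is noticed immediately.
  openssl x509 -in "${outdir}/ca.crt" -noout -subject -dates
}
```

For example, make_cert localhost writes ca.key and ca.crt into nginx/certificates and prints the subject line together with the notBefore and notAfter dates.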
-
Run docker compose up -d. Wait a few seconds and make sure everything is working as expected via docker compose ps -a.
-
(hopefully I'll make this step at least semi-automatic)
To configure Minio buckets, visit http://localhost:9001/login, log in with the username and password from .s3.env, and create a bucket named production. Change its visibility (aka "access policy") to public. Now open "access keys" -> "create access key" and fill in the form with the values of S3_ACCESS_KEY_ID and S3_SECRET_ACCESS_KEY from .env.
-
(optional) If you are going to use the Telegram Bot integration, you need a publicly reachable address. If you don't have one, you can use ngrok, a Cloudflare Tunnel, or any other similar tool. For example, if you're using ngrok, simply run ngrok http https://localhost:443. Then set the webhook URL via
docker compose exec frontier-web bundle exec rake telegram:bot:set_webhook[https://your-public-ip-address]
NOTE: enlightened zsh users will have to escape [ and ]:
docker compose exec frontier-web bundle exec rake telegram:bot:set_webhook\[https://your-public-ip-address\]
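Two of the steps above lend themselves to small helpers: checking that the stack is up, and setting the webhook without shell-specific escaping. The sketch below is hedged accordingly. Both function names are hypothetical, and all_services_running assumes that every service in docker-compose.yml is long-running; a one-shot service that exits after finishing its job would make it report failure.

```shell
# Hypothetical helper: succeed only if every service defined in the compose
# file is currently reported as running.
all_services_running() {
  local expected
  local running
  expected="$(docker compose config --services | sort)"
  running="$(docker compose ps --status running --services | sort)"
  [ -n "$expected" ] && [ "$expected" = "$running" ]
}

# Hypothetical helper: set the Telegram webhook. Quoting the task name keeps
# the square brackets literal in both bash and zsh, so no escaping is needed.
set_telegram_webhook() {
  local url="$1"
  docker compose exec frontier-web bundle exec rake "telegram:bot:set_webhook[${url}]"
}
```

A typical use is to call all_services_running in a loop with a short sleep after docker compose up -d, and then set_telegram_webhook with the public HTTPS URL.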
You'll need to create a user to log in. Run the following command:
docker compose exec frontier-web bundle exec rails db:seed
Visit https://localhost/admin and log in with the credentials listed in frontier/db/seeds.rb.
See this README for details.
Instead of relying on the web UI, you can use the RESTful HTTP API. Its OpenAPI specification is available in the frontier service documentation and is auto-generated on server startup at the /api-docs endpoint.
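To confirm that the API documentation is being served, the endpoint can be requested from the command line. This is a sketch under assumptions: it supposes that /api-docs is reachable through nginx on the same host as /admin, and it makes no claim about the response format. fetch_api_docs is a hypothetical helper name.

```shell
# Hypothetical helper: request the auto-generated API documentation.
# -k accepts the self-signed certificate, -f fails on HTTP errors,
# -L follows redirects, -sS hides the progress meter but still prints errors.
fetch_api_docs() {
  local host="${1:-localhost}"
  curl -k -f -L -sS "https://${host}/api-docs"
}
```

Use fetch_api_docs for a local setup, or pass the hostname used when generating the certificate.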