An application built with Python and the Django framework to manage regex rules that are applied to a Parquet file. The API is built with the Django REST framework library.
Install the requirements
pip install -r requirements.txt
Run the migrations
python manage.py migrate
Create a superuser
python manage.py createsuperuser
The Parquet file named UK_outlet_meal.parquet.gzip must be in the same folder as the project before the server is started.
python manage.py runserver
A sample Dockerfile is provided to run the application in an isolated environment. Optionally, create a virtual environment before building the image.
Build the image
docker build -t myproject .
Run the image
docker run -it -p 8000:8000 <image_id>
Create a superuser (inside the Docker container)
docker exec -it <container name> bash
python manage.py createsuperuser
python manage.py test
The application has two parts. The first is an API for CRUD operations on the regex patterns. Regex rules are assigned to brands, so at least one brand must exist before a new rule can be created. The second is a web page, used from a browser, for searching the Parquet file with regex patterns.
/api/brands/
Allowed methods:
- GET
Lists all brands from the database
- POST
Creates a new brand
Required fields:
- name (charfield)
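As a rough illustration, a brand could be created from Python using only the standard library. The base URL http://localhost:8000 is an assumption (it matches the port published in the Docker command), the helper name build_create_brand_request is hypothetical, and any authentication the API may require is not covered here.

```python
import json
import urllib.request

# Assumed address of the development server; adjust as needed.
BASE_URL = "http://localhost:8000"


def build_create_brand_request(name):
    """Build (but do not send) a POST request that creates a brand."""
    body = json.dumps({"name": name}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/api/brands/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_create_brand_request("Example Brand")
print(request.get_method(), request.full_url)

# With the server running, the request could be sent like this:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode("utf-8"))
```

Building the request separately from sending it keeps the payload easy to inspect before anything reaches the server.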
/api/brands/id/
Allowed methods:
- PUT
Updates a single brand
Required value in the URL path:
- id (int)
- DELETE
Deletes a single brand
Required value in the URL path:
- id (int)
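The sketch below shows one way the update and delete calls for a single brand might be built, with the brand id placed in the URL itself. The base URL and the helper name build_brand_detail_request are hypothetical, and the assumption that PUT accepts the same name field used on creation is not confirmed by this README.

```python
import json
import urllib.request

# Assumed address of the development server; adjust as needed.
BASE_URL = "http://localhost:8000"


def build_brand_detail_request(brand_id, method, name=None):
    """Build (but do not send) a PUT or DELETE request for one brand."""
    if method not in ("PUT", "DELETE"):
        raise ValueError("method must be PUT or DELETE")
    body = None
    headers = {}
    if method == "PUT":
        # Assumption: an update sends the same "name" field as creation.
        body = json.dumps({"name": name}).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        url=f"{BASE_URL}/api/brands/{int(brand_id)}/",
        data=body,
        headers=headers,
        method=method,
    )


update = build_brand_detail_request(3, "PUT", name="Renamed Brand")
delete = build_brand_detail_request(3, "DELETE")
print(update.get_method(), update.full_url)
print(delete.get_method(), delete.full_url)
```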
/api/rules/
Allowed methods:
- GET
Lists all rules from the database
- POST
Creates a new rule
Required fields:
- description (charfield)
- type_of_search (choices: contains, match_in, match_out)
- pattern (charfield)
- column (charfield)
- brands (list with brand ids)
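The required fields for a new rule can be assembled and checked on the client before the request is sent. This is only a sketch: build_rule_payload is a hypothetical helper, the column name "name" and the brand id 1 are made-up sample values, and the server may apply different or additional validation.

```python
import json
import re

# Choices listed in this README for type_of_search.
ALLOWED_SEARCH_TYPES = ("contains", "match_in", "match_out")


def build_rule_payload(description, type_of_search, pattern, column, brands):
    """Return the JSON body for creating a rule (hypothetical helper)."""
    if type_of_search not in ALLOWED_SEARCH_TYPES:
        raise ValueError(f"type_of_search must be one of {ALLOWED_SEARCH_TYPES}")
    if not brands:
        # Assumption: a rule needs at least one brand id attached.
        raise ValueError("at least one brand id is required")
    # Fail early on an invalid regex; re.compile raises re.error.
    re.compile(pattern)
    return json.dumps(
        {
            "description": description,
            "type_of_search": type_of_search,
            "pattern": pattern,
            "column": column,
            "brands": list(brands),
        }
    )


# Sample values only; the column and brand id are placeholders.
print(build_rule_payload("Find burgers", "match_in", "burger", "name", [1]))
```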
/api/rules/id/
Allowed methods:
- PUT
Updates a single rule
Required value in the URL path:
- id (int)
- DELETE
Deletes a single rule
Required value in the URL path:
- id (int)
/regex-patterns/
Allowed methods: GET, POST
The web page for applying the regex rules to the data.
Required fields:
- type of search (Default finds results that contain the input text; In and Out search for a whole word, returning the results that contain the word (In) or the results that do not contain it (Out))
- column (the column to which the regex pattern will be applied)
- pattern (the regex pattern)
On this page, the user chooses the type of search from the first dropdown menu, enters the column of the Parquet file to which the regex pattern will be applied in the second field, and defines the regex pattern in the third field.
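The three search types can be sketched with the standard re module on a plain list of strings. This is not the project's actual implementation: the mapping of Default, In and Out to contains, match_in and match_out is an assumption, as are the function name apply_rule, the use of word boundaries for whole-word matching, and case-sensitive matching.

```python
import re


def apply_rule(values, pattern, type_of_search="contains"):
    """Filter strings by one of the three search types (a sketch)."""
    if type_of_search == "contains":
        regex = re.compile(pattern)
        return [v for v in values if regex.search(v)]
    # Whole-word variants: wrap the pattern in word boundaries.
    whole_word = re.compile(rf"\b(?:{pattern})\b")
    if type_of_search == "match_in":
        return [v for v in values if whole_word.search(v)]
    if type_of_search == "match_out":
        return [v for v in values if not whole_word.search(v)]
    raise ValueError(f"unknown type_of_search: {type_of_search}")


# Made-up sample data standing in for one column of the Parquet file.
meals = ["cheeseburger", "burger meal", "veggie wrap"]
print(apply_rule(meals, "burger", "contains"))   # ['cheeseburger', 'burger meal']
print(apply_rule(meals, "burger", "match_in"))   # ['burger meal']
print(apply_rule(meals, "burger", "match_out"))  # ['cheeseburger', 'veggie wrap']
```

The example shows why the distinction matters: "cheeseburger" contains the text "burger" but does not contain it as a whole word, so it appears in the Default and Out results and not in the In results.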