[RFC0014] Automated Image-cropping Pipeline #30
Named Concepts
image-processing.bdrc.io
The system or model should work on any type of image, regardless of whether it is in pecha format or modern-publication format.

Summary
BDRC has many images that contain several Pecha pages. We need to automate the image-cropping process with a custom computer vision model. This project will use Prodigy as a human-in-the-loop pipeline to create an initial training dataset, train a model, and iteratively improve it.

Reference-Level Explanation
**System Diagram:** (diagram not reproduced here)

On the command line, we use Prodigy's built-in image.manual recipe with image_dataset and manually draw the boundary of each page in every image in image_dataset.
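A minimal sketch of that command; the label name and the local image directory are illustrative assumptions, and `image_dataset` is the dataset name used above:

```bash
# Open the Prodigy UI and draw one bounding box per pecha page,
# saving the annotations into the "image_dataset" dataset.
prodigy image.manual image_dataset ./images --label PECHA_PAGE
```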
Preparing the training dataset
Here we manually draw boundaries on each image for the training dataset. We can make this go faster by opening it up to more people: deploy the Prodigy annotation server to the web using AWS, so that more people can take part in drawing the boundaries around the PECHA pages at the same time (a sketch follows).
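As a sketch of that deployment, Prodigy's documented PRODIGY_HOST and PRODIGY_PORT settings can expose the annotation server from an AWS machine; the port value and the security-group setup are assumptions:

```bash
# Bind to all interfaces so other annotators on the internet can reach the UI
# (assumes the host's firewall/security group opens port 8080).
PRODIGY_HOST=0.0.0.0 PRODIGY_PORT=8080 prodigy image.manual image_dataset ./images --label PECHA_PAGE
```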
Check whether we have enough training data to train the model

It will print the accuracy figures and the accuracy improvements you can expect with more data. This recipe takes pretty much the same arguments as train.
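The recipe described here is Prodigy's train-curve; a sketch, reusing the ds_GOLD dataset named in the next section:

```bash
# Train on growing slices of the annotations (25%, 50%, 75%, 100%)
# and report how accuracy changes as more data is added.
prodigy train-curve --ner ds_GOLD --show-plot
```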
Train the model
You can use the train recipe to train within Prodigy, or train outside it using spaCy or other NLP packages. Its arguments:

- --ner -> tells Prodigy that you are training an NER component
- ds_GOLD -> name of the local dataset that holds your manual annotations
- ./tmp_model -> path where Prodigy will create your model
- --eval-split -> the train/test ratio you want Prodigy to use when splitting the annotations in your dataset
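A sketch of the assembled command; the 0.2 split value is an illustrative choice:

```bash
# Train an NER model from the ds_GOLD annotations, holding out 20% for evaluation.
prodigy train ./tmp_model --ner ds_GOLD --eval-split 0.2
```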
Human in the Loop
Once we have a basic model, we can speed up the cropping process dramatically by letting the model attempt the rest of the image cropping.
The model takes over, crops the rest of the dataset, and binarizes the decision process into an ACCEPT or REJECT for you.
If we notice that the model is not doing the cropping job well, then the training dataset for cropping needs further correction, by opting for the ner.correct recipe (sketched below).
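A sketch of the correction step as this RFC names it. Note that ner.correct is documented for text NER, so whether the identical flow carries over to image tasks is one of the unresolved questions below; the output dataset, model path, and source file names are assumptions:

```bash
# Let the trained model pre-annotate, then accept or correct its suggestions by hand.
prodigy ner.correct ds_GOLD_corrected ./tmp_model/model-best ./data.jsonl --label PECHA_PAGE
```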
Prodigy output
Through Prodigy, we will get the coordinates of the borders drawn around each page in the image.
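For a rectangular box drawn in image.manual, the saved annotation has roughly the following shape (the values here are illustrative):

```json
{
  "image": "I8LS766730003_0.jpg",
  "width": 2000,
  "height": 700,
  "spans": [
    {"label": "PECHA_PAGE", "points": [[120, 80], [940, 80], [940, 620], [120, 620]]}
  ],
  "answer": "accept"
}
```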
Image Cropping
We use the Python Imaging Library (PIL), which provides the Python interpreter with image-editing capabilities. This library can crop an image based on the coordinates that we get from Prodigy, as sketched below.
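A minimal sketch of that cropping step, assuming Pillow is installed and the annotation has the shape shown above; the file names follow the naming convention from the Work Phases section:

```python
from PIL import Image

# One annotation as exported from Prodigy (illustrative values).
annotation = {
    "image": "I8LS766730003_0.jpg",
    "spans": [
        {"label": "PECHA_PAGE", "points": [[120, 80], [940, 80], [940, 620], [120, 620]]},
    ],
}

img = Image.open(annotation["image"])
for i, span in enumerate(annotation["spans"], start=1):
    xs = [x for x, y in span["points"]]
    ys = [y for x, y in span["points"]]
    # Image.crop takes a (left, upper, right, lower) box.
    page = img.crop((min(xs), min(ys), max(xs), max(ys)))
    page.save(f"I8LS766730003_{i}.jpg")
```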
Alternatives
Manually cropping each image to the page borders; but BDRC doesn't have the human power to do this work manually.

Rationale

- Why was the currently proposed design selected over the alternatives?
  - First, manually cropping each image is a tedious job, and we don't have the manpower to do it.
  - Doing it without deploying to AWS would delay the completion date.
- What would be the impact of going with one of the alternative approaches?
  - Based on my understanding, the proposed design is the better solution.
- Is the evaluation tentative, or is it recommended to use more time to evaluate different approaches?
  - Yes.

Drawbacks
We need AWS to host the images, plus a domain, to make the annotation UI available to other people on the internet for drawing the boundaries.
Useful References
- Prodigy: [Prodi.gy computer-vision docs](https://prodi.gy/docs/computer-vision)
- [Using Prodigy for NLP text annotation](https://medium.com/mlearning-ai/using-prodigy-for-nlp-text-annotation-revolution-ai-for-spacy-e5561d93a361)
- [spaCy v3.4 documentation](https://spacy.io/usage/v3-4)

Unresolved Questions
- What is there that is unresolved (and will be resolved as part of fulfilling this request)?
  - Prodigy is mostly used for Named Entity Recognition (NER), hence most of the documentation and online articles are about NER. When I went through the documentation, it did not give equivalent instructions for images. Hence, there is no way of confirming that everything will work the same for images.
  - Otherwise, not to my knowledge.
Parts of the System Affected
Future possibilities
- We can run Prodigy and the system built around it; we don't have to crop the images ourselves.
- The model will crop the images and save them in the same format in a specified location.

Infrastructure
**Front end** - No need to do anything, because Prodigy has a web interface to draw rectangle or polygon shapes onto the image.

**Backend**
Testing
We will measure the performance of the model by training it and testing it on the remaining images of PECHA. We will check the accuracy by using the teach and correct recipes.

Documentation
User documentation
Developer documentation
Version History
Recordings
- None

Work Phases
Pre-processing of images from BDRC's s3 server

- according to the requirements from Elie, passed as a dict of image_options
- the image should be checked for whether it is binary or non-binary, and the processed file named accordingly:
  - if binary: new_filename = origfilename + "_" + str(degree) + ".png"
  - else: new_filename = origfilename + "_" + str(degree) + ".jpg"
- processed images will be uploaded back to s3, to use in Prodigy (a sketch follows this list)
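A minimal sketch of this pre-processing step, assuming Pillow and boto3. `degree` is kept from the naming scheme above (presumably a rotation value); the bucket name reuses the Named Concepts entry, and the key prefix and exact image_options key names are assumptions. The size, quality, and encoding values mirror the Implementation checklist below.

```python
import io

import boto3
from PIL import Image

# image_options as described in this work phase (key names assumed).
image_options = {
    "max_width": 2000,
    "max_height": 700,
    "quality": 75,
    "progressive": True,
    "greyscale": False,
}

def preprocess(orig_bytes, origfilename, degree, opts):
    """Resize/re-encode one image; returns (new_filename, encoded bytes)."""
    img = Image.open(io.BytesIO(orig_bytes))
    is_binary = img.mode == "1"  # 1-bit images count as binary scans
    if opts["greyscale"]:
        img = img.convert("L")
    # Shrink in place so neither dimension exceeds the configured maximum.
    img.thumbnail((opts["max_width"], opts["max_height"]))
    buf = io.BytesIO()
    if is_binary:
        new_filename = origfilename + "_" + str(degree) + ".png"
        img.save(buf, format="PNG")
    else:
        new_filename = origfilename + "_" + str(degree) + ".jpg"
        if img.mode not in ("RGB", "L"):
            img = img.convert("RGB")
        img.save(buf, format="JPEG",
                 quality=opts["quality"], progressive=opts["progressive"])
    return new_filename, buf.getvalue()

# Upload the processed image back to s3 for Prodigy to stream.
s3 = boto3.client("s3")
with open("I8LS766730003.jpg", "rb") as f:
    name, data = preprocess(f.read(), "I8LS766730003", 0, image_options)
s3.put_object(Bucket="image-processing.bdrc.io", Key="processed/" + name, Body=data)
```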
Creating a custom recipe as per our requirements

- encode the image's s3 URL in a JSONL file for the Prodigy server to use in the recipe (one line per image, as sketched below)
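For illustration, one line of that JSONL feed could look like this; the bucket URL and key prefix are assumed placeholders:

```jsonl
{"image": "https://s3.amazonaws.com/image-processing.bdrc.io/processed/I8LS766730003_0.jpg", "meta": {"file": "I8LS766730003_0.jpg"}}
```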
Non-Coding
Implementation
- [ ] Pipeline for processing and getting the images ready for the Prodigy server (@ta4tsering)
  - the maximum width should be 2000
  - the maximum height should be 700
  - the quality of the image should be 75%
  - the image should be encoded using progressive encoding, default True
  - the image should be converted to greyscale if greyscale is True, default is False
  - the image should be checked for whether it is binary or non-binary, and the processed file named accordingly
- [ ] Creating the recipe to stream images directly from s3 to the Prodigy server (@ta4tsering)
- [ ] Alternative method to stream s3 images into Prodigy using JSONL (@Zakongjampa)
- [ ] Training the Prodigy image-cropping model (@Zakongjampa, @ta4tsering)
- [ ] Human-in-the-loop to annotate or crop the images
  - Naming convention for the annotated images output from Prodigy:
    - Example input: .jpg, I8LS766730003.jpg
    - Example output: _1.jpg, _2.jpg, I8LS766730003_1.jpg, I8LS766730003_2.jpg
- [ ] Train the model
- [ ] Quality control of the Prodigy model
  - good-enough
  - model is trained
- [ ] Train using the TensorFlow Object Detection API (@ta4tsering)
- [ ] Tests (@ta4tsering)