Rework - IMPORTANT #64

Open
Breta01 opened this issue Nov 20, 2018 · 5 comments

Comments

@Breta01
Owner

Breta01 commented Nov 20, 2018

The project is currently undergoing a major reorganization. The new code is in the rework branch, and once this issue is closed it will be merged into master. What will be new:

  • A more logical structure
  • Incorporating new, much larger datasets
  • Unifying the naming and style of the code
  • Removing deprecated code
  • Dropping support for Czech accent recognition

This brings some breaking changes. I recommend moving to the new code, because I will no longer fix issues in the old versions.

Model retraining

With the new version, some old models may become incompatible. The old models were also trained only on a small dataset, so extensive retraining is required. I would appreciate any help with this task, because I have only limited access to cloud computing resources.

Dropping support of Czech accents

The Czech accents will be removed from the words; only some text files that allow the accents to be recovered will be kept. This solves some compatibility issues across different operating systems. Also, the models trained on this dataset weren't very accurate.
However, as a school project, I will be creating software that automatically adds Czech accents to sentences. This is only a partial solution to the problem, but I don't have enough data to recognize the accents successfully anyway.
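
As a rough illustration of the idea above, here is a minimal sketch of removing accents from words while keeping a lookup that allows them to be recovered. It assumes Python's standard `unicodedata` module; the function names `strip_accents` and `build_recovery_map` are hypothetical and are not taken from the project's code.

```python
import unicodedata


def strip_accents(word: str) -> str:
    """Return the word with combining diacritical marks removed."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", word)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def build_recovery_map(words):
    """Map each accent-free word to the set of original spellings it came from."""
    recovery = {}
    for word in words:
        recovery.setdefault(strip_accents(word), set()).add(word)
    return recovery


if __name__ == "__main__":
    # Hypothetical sample words, used only for demonstration.
    sample = ["příliš", "žluťoučký", "kůň", "kun"]
    print([strip_accents(w) for w in sample])
    print(build_recovery_map(sample))
```

Note that the mapping is not one-to-one: several accented words can collapse to the same accent-free form, which is one reason restoring accents automatically is only a partial solution.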

@Breta01
Owner Author

Breta01 commented Nov 26, 2018

Some updates:

  • I updated the ocr package
  • I am finishing the dataset section with all the scripts. It should be a big step up for the project, so please let me know if it works.
  • I will continue with the rework of the notebooks

Breta01 added this to Report in Reorganization on Nov 28, 2018
@Breta01
Owner Author

Breta01 commented Apr 1, 2020

  • I will try to follow this guide for updating the project: https://guide.esciencecenter.nl/
  • I will also try to automate as many tasks as possible.
  • Update for TensorFlow 2.0
  • Follow the Black code style

Breta01 mentioned this issue on Apr 2, 2020
@Breta01
Owner Author

Breta01 commented Apr 8, 2020

Ideas for better promotion of the project (see https://guide.esciencecenter.nl/best_practices/communication.html):

  • Web page
  • Docker image
  • Online demo
  • Screencast

I am also thinking about adding tests and setting up continuous integration, for example with Travis CI.

@SRK-returns

Hi, I'm having trouble understanding the readme files. Is there a YouTube video that explains how to get the datasets and create the environments? Most of the packages are unavailable for installation.

@Breta01
Owner Author

Breta01 commented Apr 27, 2021

Hi @SRK-returns,

Which branch are you using, the update branch or the master branch? I don't have any video instructions. The setup also depends on your OS.
