wiki getting spammed #2873
Please tell me what you think about the following option. With the current policy, which started today, it seems that most people (including me) are blocked from contributing to the wiki. |
What about using http://tesseract-ocr.github.io/ for that? The repository is https://github.com/tesseract-ocr/tesseract-ocr.github.io. Of course we would need to invest some work in the current HTML page and add a theme (which one?), but in the end we could offer both source code and user documentation there. |
Actually, the wiki is a git repo (try cloning https://github.com/tesseract-ocr/tesseract.wiki.git). I added you and @Shreeshrii as collaborators. Can you try to edit the wiki? |
Thanks, this will stop the spam posts, but we need to choose an option that also allows others to easily contribute to documentation efforts, with moderation/approval. I do not know enough to weigh in on the pros and cons of the alternatives proposed. |
What I have found so far is that GitHub allows editing the wiki in 2 ways:
There is no other possibility... |
Here is a test which shows how the documentation could be presented in the future: https://ub-mannheim.github.io/. The current Wiki including the full history could be moved to https://github.com/tesseract-ocr/tesseract-ocr.github.io, and contributors would simply send pull requests to add or update documentation. |
In addition, I would suggest keeping a single wiki page that points to the new locations of the documentation. Does github.io need the pages to be in HTML, or would markdown work? |
All pages are markdown (unmodified copy from Wiki). |
Questions: Should Api/examples be moved into an examples directory within the tesseract repo? Should Tesseract 5 also be added as a category? |
Zdenko, thanks for the invitation. Yes, now I can edit the wiki. I still think we should let most users (but not the spammers) edit the docs just like we do with the code. Stefan's proposal looks good to me. |
I agree.
Is there any way to preview the available themes before choosing one? @stweil What would be the next steps to make this operational? |
You can follow these instructions without actually creating a new GitHub Page, or read https://pages.github.com/themes/. See also the other documentation on GitHub Pages. |
@stweil Thank you! Does Tesseract have an official logo? Should we adopt the one being used on https://opensource.google/projects/tesseract ? The theme used can then be coordinated with it. @zdenop Your thoughts? |
https://github.com/tesseract-ocr/tesseract-ocr.github.io/tree/wiki includes the steps 1...4 from my list. @zdenop, if you agree I'd push that branch to master to make it operational. |
I think it would be better to separate the doxygen repo from the wiki repo. https://stackoverflow.com/questions/15563685/can-i-create-more-than-one-repository-for-github-pages Another thing: I don't see that the wiki's history is preserved on your site. |
I did not clone the wiki history for the UB-Mannheim site, but it is preserved in https://github.com/tesseract-ocr/tesseract-ocr.github.io/tree/wiki. |
All GitHub Pages content for Tesseract would always be under https://tesseract-ocr.github.io/. The related repository is https://github.com/tesseract-ocr/tesseract-ocr.github.io. Every other repository, for example https://github.com/tesseract-ocr/tessdata, can contribute GitHub Pages, which would be visible under https://tesseract-ocr.github.io/tessdata. So separating the Doxygen-generated API documentation would require a repository with that documentation, for example https://github.com/tesseract-ocr/api. Example: https://ub-mannheim.github.io/ with https://ub-mannheim.github.io/PalMA/. |
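The URL layout described above can be sketched as a small mapping from a repository to its GitHub Pages address. This is only an illustration: the helper name `pages_url` is hypothetical, and it assumes the default `github.io` domain (no custom domain) with the organization name lowercased in the host.

```python
def pages_url(org: str, repo: str) -> str:
    """Return the default GitHub Pages URL for a repository (sketch)."""
    host = f"{org.lower()}.github.io"
    # The special "<org>.github.io" repository is served at the site root.
    if repo.lower() == host:
        return f"https://{host}/"
    # Every other repository is served under a path named after the repo.
    return f"https://{host}/{repo}/"


print(pages_url("tesseract-ocr", "tesseract-ocr.github.io"))
print(pages_url("tesseract-ocr", "tessdata"))
print(pages_url("UB-Mannheim", "PalMA"))
```

With this layout, a separate repository for the API documentation would automatically get its own path under the same host.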
The index page can also link to https://github.com/tesseract-ocr/docs |
@stweil : thanks - go ahead. I do not have free time at this time of year, so I appreciate any support... |
I agree. I think the doxygen repo can be pretty large, keeping info for three releases. It would be better if the wiki documentation were kept separate from this. |
My suggestions:
- Move the wiki content to https://github.com/tesseract-ocr/guide
- Move the doxygen content to https://github.com/tesseract-ocr/doxygen
- https://github.com/tesseract-ocr/tesseract-ocr.github.io/ should only contain links to the other repos.
I'm not sure if this is necessary, but instead of one doxygen repo, you could have one 'doxygen-x.xx' repo for each release. |
That's correct. Each version needs about 140 MB, and I think there should be the versions 3.x (currently 3.05), 4.x (currently 4.1) and latest (Git master). Here are the current sizes:
https://github.com/tesseract-ocr/doxygen might be too specific, as the software could change its name or be replaced by a different one in the future. I suggest https://github.com/tesseract-ocr/tessapi. Then forks still have the |
Forks would then also be named |
Yes, you're right. One alternative is clang-doc
Ok.
I hope that no one who forked this repo will complain.
You're right.
Choose one :) |
Here is the new status:
|
Please note that repo changes need some time (typically a few minutes) until they are visible on GitHub Pages. |
Thanks @stweil. Looking good. Will you be deleting the wiki pages and announcing this in the forum/mailing lists now? |
Thanks Stefan. Nice work. |
About the 'old' wiki: there are links from the forum, blogs and other sites to the wiki. I suggest keeping only the top header in each page, and using the global
If you want, after you do these changes, you can remove the history and keep just the last commit. |
Another option is to just keep the 'Home' page, delete its content and add the message. I think my previous suggestion is better. |
There is now a footer which marks the wiki pages as unmaintained and links to the new URL. |
Is it possible to set up an automatic redirect from the wiki pages to the new pages? |
I saw your top warning in the 'Home' page, but didn't notice the footer. I created this page: https://github.com/tesseract-ocr/tesseract/wiki/_Header.md Please delete it. I don't see an option to delete it. Sorry for the spamming :) |
All links from README in https://github.com/tesseract-ocr/tessdoc are broken. Edit:
@stweil This should work for you locally also. Please check. |
There are also still many absolute links and texts which refer to the Wiki. |
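A minimal sketch of how such absolute wiki links could be found and rewritten in the copied markdown files. It assumes that each wiki page `<Page>` now lives next to the other pages as `<Page>.md`; the function name `rewrite_wiki_links` and that file layout are assumptions for illustration, not something confirmed in this thread.

```python
import re

# Absolute links to pages of the old wiki, with an optional "#anchor".
WIKI_LINK = re.compile(
    r"https://github\.com/tesseract-ocr/tesseract/wiki/"
    r"([\w\-]+(?:\.[\w\-]+)*)(#[\w\-]+)?"
)


def rewrite_wiki_links(markdown: str) -> str:
    """Replace absolute wiki URLs with relative links to "<Page>.md"."""
    def to_relative(match: re.Match) -> str:
        page = match.group(1)
        anchor = match.group(2) or ""
        return f"{page}.md{anchor}"

    return WIKI_LINK.sub(to_relative, markdown)


sample = (
    "See [training](https://github.com/tesseract-ocr/tesseract/wiki/"
    "4.0-with-LSTM#training) and "
    "https://github.com/tesseract-ocr/tesseract/wiki/FAQ."
)
print(rewrite_wiki_links(sample))
```

Special wiki URLs such as history or edit views are not handled here and would still need a manual check.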
Please see nzbget/nzbget#383 |
How about removing the wiki content on each page and replacing it with a link to the corresponding GitHub Pages page, similar to https://github.com/netdata/netdata/wiki/a-github-star-is-important ? At a later date we can just have a home page in the wiki and point to the new GitHub Pages for documentation. |
https://github.com/tcort/markdown-link-check https://github.com/dkhamsing/awesome_bot |
I have added the MOVE notice as a custom sidebar to the wiki pages. It will be more noticeable than the footer and will be available on every wiki page. |
I have replaced the content in all wiki pages (except Home and ReadMe) with a MOVE message and added a link to the corresponding tessdoc page. I have also created a PR with FAQ.md (converted from the asciidoc version). If there are no more pending items regarding the wiki, this issue can be closed. We can open a new issue regarding the reorg of pages in the tessdoc repo. |
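For anyone who wants to repeat this on a local clone of the wiki, the replacement could be scripted roughly as follows. This is a sketch, not the script that was actually used: the base URL `NEW_BASE`, the wording of the notice, and the helper names are all assumptions, and pages starting with `_` (sidebar, footer) are skipped as a precaution.

```python
from pathlib import Path

# Assumed base URL of the new documentation; adjust if the real one differs.
NEW_BASE = "https://tesseract-ocr.github.io/tessdoc/"


def move_notice(page: str) -> str:
    """Build the replacement text for one wiki page."""
    return (
        f"# {page} has moved\n\n"
        "This wiki page is no longer maintained. "
        f"Please see {NEW_BASE}{page} instead.\n"
    )


def replace_pages(wiki_dir: str, keep=("Home", "ReadMe")) -> list:
    """Overwrite every page in a wiki clone except the ones in `keep`."""
    replaced = []
    for path in sorted(Path(wiki_dir).glob("*.md")):
        page = path.stem  # "4.0-with-LSTM.md" -> "4.0-with-LSTM"
        if page in keep or page.startswith("_"):
            continue  # leave Home, ReadMe and special pages untouched
        path.write_text(move_notice(page), encoding="utf-8")
        replaced.append(page)
    return replaced


print(move_notice("FAQ"))
```

Running `replace_pages` on a clone of the wiki repository and then committing and pushing would apply the notices; the wiki history keeps the old content.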
@stweil Suggest renaming this as |
Currently, the Shouldn't it just list the header files in the In the future we may also include the files in the (not yet existing) |
Could we use a style that does not use frames? |
The tessapi repo is too big. What about my earlier suggestion to split it into several repos?
|
Ideally, generated documentation should not be part of the tessapi repository at all. There is no need to keep the history for generated data, and only the latest documentation for the different branches or releases is relevant. If that documentation could be stored on some storage with a web interface, that would be sufficient. Is there such storage available? The generated documentation includes not only the API but also documentation of the full source code. I think that is fine because both parts are needed, but maybe the name |
Most devs need just the API docs, not the full docs, but examples are also needed to demonstrate how to use the API. The full code docs are needed only by people developing Tesseract itself.
I think it's OK to do a force push to that repo. With this policy, you can solve the history issue. |
Sure, that would be possible. But why? When I shrink the menu on the left side with the mouse, it looks like the style without frames. |
Currently it documents 3.05.02, 3.x, 4.0.0 and latest. I think one of 3.05.02 or 3.x could be removed which would reduce the size by about 25 %.
Should we document branches (3.05, 4.1) or tagged versions (3.05.02, 4.1.1) for older code? Maybe the unmaintained branches / versions don't need code documentation online, so removing one more would be possible. I suggest keeping only latest and 4.x (currently based on the 4.1 branch). |
In one word: Accessibility. Try to zoom in on a page (to 200%) when browsing that repo and you should see the problem. |
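The size estimates in this discussion can be checked with a few lines of arithmetic. The sketch assumes that every documented version takes roughly the same space, using the approximate figure of 140 MB per version quoted earlier in the thread; the function names are made up for illustration.

```python
MB_PER_VERSION = 140  # approximate size per version, as quoted in the thread


def repo_size_mb(versions) -> int:
    """Estimated size of the generated docs for the given versions."""
    return MB_PER_VERSION * len(versions)


def reduction_percent(before, after) -> float:
    """Relative size reduction when going from `before` to `after`."""
    old, new = repo_size_mb(before), repo_size_mb(after)
    return 100 * (old - new) / old


current = ["3.05.02", "3.x", "4.0.0", "latest"]
print(repo_size_mb(current))                                  # 560
print(reduction_percent(current, ["3.x", "4.0.0", "latest"]))  # 25.0
print(reduction_percent(current, ["4.x", "latest"]))           # 50.0
```

Under that assumption, dropping one of the four versions saves about a quarter of the space, and keeping only latest and 4.x saves about half.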
The relevant option in Doxyfile is |
Please see https://github.com/tesseract-ocr/tesseract/wiki/4.0-with-LSTM/_history
The content on 4.0-with-LSTM page has been deleted.
There should be some control or moderation of changes to wiki.
I would also suggest using the https://github.com/apps/stale bot to prune inactive/old issues, though the number of days of inactivity can be set larger than the default of 60.