This code automates the whois request for a .gr domain, sent to https://grweb.ics.forth.gr/public/whois.jsp?lang=en, by cracking the captcha shown on that page.
Forth.gr manages the official .gr registry, so it is the only reliable place to get information about .gr domains. International whois services return limited results (for example, http://whois.domaintools.com/in.gr).
It works by cleaning up the captcha image using native Java and some simple processing, and then using the Google Tesseract OCR library to read the cleaned-up image.
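As a rough illustration of the kind of "simple processing" meant here, the sketch below converts an image to pure black and white with a fixed brightness threshold, which is a common first step before OCR. It is not taken from Main.java: the class name `Binarize`, the method `binarize`, and the threshold value are assumptions made for this example, and the real cleanup steps are the ones described in the blog post and the source comments.

```java
import java.awt.image.BufferedImage;

public class Binarize {
    // Hypothetical helper: pixels darker than the threshold become black,
    // everything else becomes white.
    static BufferedImage binarize(BufferedImage src, int threshold) {
        BufferedImage out = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_BYTE_BINARY);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Integer approximation of perceived brightness (0..255).
                int luma = (299 * r + 587 * g + 114 * b) / 1000;
                out.setRGB(x, y, luma < threshold ? 0xFF000000 : 0xFFFFFFFF);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Tiny in-memory image: one dark pixel and one light pixel.
        BufferedImage img = new BufferedImage(2, 1, BufferedImage.TYPE_INT_RGB);
        img.setRGB(0, 0, 0x202020);
        img.setRGB(1, 0, 0xE0E0E0);
        BufferedImage bw = binarize(img, 128); // 128 is an arbitrary example threshold
        System.out.println(Integer.toHexString(bw.getRGB(0, 0)));
        System.out.println(Integer.toHexString(bw.getRGB(1, 0)));
    }
}
```

A fixed threshold is the simplest choice. Whether it is enough depends on the noise in the image, so treat the value as something to tune.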
Each attempt succeeds about 40% of the time, so the program retries until the request succeeds. On a successful run, the whois information is printed to the console.
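The retry behaviour can be sketched as a small generic loop. This is only an illustration, not the code in Main.java: the names `RetryUntilSuccess`, `retry` and `chanceWithin`, and the cap on attempts, are assumptions made for this example. If attempts are independent and each succeeds with probability 0.4, the expected number of attempts is 1 / 0.4 = 2.5, and the chance of at least one success in n attempts is 1 - 0.6^n.

```java
import java.util.Optional;
import java.util.function.Supplier;

public class RetryUntilSuccess {
    // Calls the attempt until it returns a value, up to maxAttempts times.
    static <T> T retry(Supplier<Optional<T>> attempt, int maxAttempts) {
        for (int i = 1; i <= maxAttempts; i++) {
            Optional<T> result = attempt.get();
            if (result.isPresent()) {
                return result.get();
            }
        }
        throw new IllegalStateException("No success after " + maxAttempts + " attempts");
    }

    // Probability of at least one success in n independent attempts.
    static double chanceWithin(double successRate, int n) {
        return 1.0 - Math.pow(1.0 - successRate, n);
    }

    public static void main(String[] args) {
        // Stand-in for a real lookup: fails twice, then succeeds.
        int[] calls = {0};
        String info = RetryUntilSuccess.<String>retry(() -> {
            calls[0]++;
            return calls[0] < 3 ? Optional.<String>empty() : Optional.of("whois data");
        }, 10);
        System.out.println(info + " after " + calls[0] + " attempts");
        System.out.println(chanceWithin(0.4, 5));
    }
}
```

Capping the number of attempts is a design choice for the sketch. It keeps a persistent failure, such as a network outage, from looping forever.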
For a detailed outline of the method used, see the blog post here: http://nikos.glikis.net/hacks/the-forthpwn/ or see the comments in Main.java.
The repository is ~160MB because it includes the libraries used. I have not used Maven, so that deployment is easier for users who don't know Maven.
Also, the whole IntelliJ project is uploaded, including the project directory .idea, so that it is easier to run and edit. Just download IntelliJ IDEA (free from https://www.jetbrains.com/idea/download/) and open the project folder. Then just click RUN.
If you want to build and run manually, the commands are included below (the precompiled classes work on JDK 8):
Linux:
javac -cp .:lib/lib/commons-io-2.4.jar:lib/jna-3.3.0.jar:lib/jna-3.3.0-platform.jar:lib/tess4j/dist/tess4j-3.0.jar:lib/jsoup-1.8.3.jar:out/production/captchalib/lib/ghost4j-1.0.0.jar:lib/lib/hamcrest-core-1.3.jar:lib/lib/itext-2.1.7.jar:lib/lib/jai_imageio.jar:lib/lib/jna.jar:lib/lib/jul-to-slf4j-1.7.13.jar:lib/lib/junit-4.12.jar:lib/lib/lept4j-1.0.1.jar:lib/lib/log4j-1.2.17.jar:lib/lib/logback-classic-1.1.3.jar:lib/lib/logback-core-1.1.3.jar:lib/lib/rococoa-core-0.5.jar:lib/lib/slf4j-api-1.7.13.jar:lib/lib/jna.jar:lib/lib/xmlgraphics-commons-1.5.jar -d out/production/captcha src/com/tools/forth/*.java
Windows:
javac -cp .;lib/lib/commons-io-2.4.jar;lib/jna-3.3.0.jar;lib/jna-3.3.0-platform.jar;lib/tess4j/dist/tess4j-3.0.jar;lib/jsoup-1.8.3.jar;out/production/captchalib/lib/ghost4j-1.0.0.jar;lib/lib/hamcrest-core-1.3.jar;lib/lib/itext-2.1.7.jar;lib/lib/jai_imageio.jar;lib/lib/jna.jar;lib/lib/jul-to-slf4j-1.7.13.jar;lib/lib/junit-4.12.jar;lib/lib/lept4j-1.0.1.jar;lib/lib/log4j-1.2.17.jar;lib/lib/logback-classic-1.1.3.jar;lib/lib/logback-core-1.1.3.jar;lib/lib/rococoa-core-0.5.jar;lib/lib/slf4j-api-1.7.13.jar;lib/lib/jna.jar;lib/lib/xmlgraphics-commons-1.5.jar -d out/production/captcha src/com/tools/forth/*.java
To run the program, pass the domain you want to look up as the first argument.
Again, this is tested only on JDK 8. On Ubuntu you need to install Tesseract first:
apt-get install tesseract-ocr
The Windows DLLs are included.
Linux:
java -cp .:lib/lib/commons-io-2.4.jar:lib/jna-3.3.0.jar:lib/jna-3.3.0-platform.jar:lib/tess4j/dist/tess4j-3.0.jar:lib/jsoup-1.8.3.jar:out/production/captchalib/lib/ghost4j-1.0.0.jar:lib/lib/hamcrest-core-1.3.jar:lib/lib/itext-2.1.7.jar:lib/lib/jai_imageio.jar:lib/lib/jna.jar:lib/lib/jul-to-slf4j-1.7.13.jar:lib/lib/junit-4.12.jar:lib/lib/lept4j-1.0.1.jar:lib/lib/log4j-1.2.17.jar:lib/lib/logback-classic-1.1.3.jar:lib/lib/logback-core-1.1.3.jar:lib/lib/rococoa-core-0.5.jar:lib/lib/slf4j-api-1.7.13.jar:lib/lib/jna.jar:lib/lib/xmlgraphics-commons-1.5.jar:out/production/captcha com.tools.forth.Main enikos.gr
Windows:
java -cp .;lib/lib/commons-io-2.4.jar;lib/jna-3.3.0.jar;lib/jna-3.3.0-platform.jar;lib/tess4j/dist/tess4j-3.0.jar;lib/jsoup-1.8.3.jar;out/production/captchalib/lib/ghost4j-1.0.0.jar;lib/lib/hamcrest-core-1.3.jar;lib/lib/itext-2.1.7.jar;lib/lib/jai_imageio.jar;lib/lib/jna.jar;lib/lib/jul-to-slf4j-1.7.13.jar;lib/lib/junit-4.12.jar;lib/lib/lept4j-1.0.1.jar;lib/lib/log4j-1.2.17.jar;lib/lib/logback-classic-1.1.3.jar;lib/lib/logback-core-1.1.3.jar;lib/lib/rococoa-core-0.5.jar;lib/lib/slf4j-api-1.7.13.jar;lib/lib/jna.jar;lib/lib/xmlgraphics-commons-1.5.jar;out/production/captcha com.tools.forth.Main enikos.gr
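The Linux and Windows commands above differ only in the classpath separator (`:` versus `;`). As a small sketch of that difference, the hypothetical helper below joins a list of jar paths with whatever separator the current platform uses, as reported by `java.io.File.pathSeparator`. The class and method names are made up for this example, and the jar list is shortened to a few entries from the commands above.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

public class ClasspathBuilder {
    // Joins classpath entries with the given separator (":" on Linux, ";" on Windows).
    static String join(List<String> entries, String separator) {
        return String.join(separator, entries);
    }

    public static void main(String[] args) {
        // Shortened example list; the full set of jars is in the commands above.
        List<String> entries = Arrays.asList(
                ".",
                "lib/lib/commons-io-2.4.jar",
                "lib/tess4j/dist/tess4j-3.0.jar",
                "lib/jsoup-1.8.3.jar",
                "out/production/captcha");
        // File.pathSeparator is the separator for the platform running this code.
        System.out.println(join(entries, File.pathSeparator));
    }
}
```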