Aspose.OCR for Cloud
Aspose.OCR for Cloud is a REST API for optical character recognition and document scanning in the cloud. It reads and recognizes characters from the most commonly used raster image formats. Pass the image name and its format to the Aspose.OCR for Cloud REST API and it returns a JSON response that includes the recognized text.
Aspose.OCR for Cloud helps you quickly add OCR functionality to your application.
It is easy to get started with Aspose.OCR for Cloud, and there is nothing to install. Simply create an account at Aspose for Cloud and get your application information; then you are ready to use our SDK or REST API with any language, on any platform.
How to use it in your project
Aspose.OCR for Cloud is implemented as a REST API.
Our API is completely independent of your operating system, database system, and development language. You can use any language and platform that supports HTTP to interact with it. However, writing client code by hand can be difficult, error-prone, and time-consuming, so we provide and support SDKs in several development languages to make integration easier. An SDK hides the REST API calls and lets you use Aspose.OCR features in a way that is native to your preferred language.
// Create the API clients, then upload the file using the storage API
OcrApi ocrApi = new OcrApi(APIKEY, APPSID, BASEPATH);
IOcrV2Api ocrApiV2 = new OcrV2Api(APIKEY, APPSID, BASEPATH);
StorageApi storageApi = new StorageApi(APIKEY, APPSID, BASEPATH);
ResponseMessage putCreateResponse = storageApi.PutCreate(name, null, null, System.IO.File.ReadAllBytes(Path.Combine(DataFolder, name)));

// Recognize using OCR Cloud API
OCRResponse ocrResponseV2 = ocrApiV2.OcrV2GetRecognizeDocument(name, null, null);
Console.WriteLine(ocrResponseV2.Status);
Console.WriteLine(ocrResponseV2.Text);
The fastest way to learn our REST API is to try the request builder on the API Reference page.
Versions and APIs
There are currently two versions of Cloud OCR: V2 and V3. V2 is based on classic OCR algorithms and performs the entire OCR process. V3 is based on neural networks; it has limited functionality and experimental algorithms, but it is very promising and is the version we will continue to develop.
The APIs of the two versions are the same and differ only in the server address, so you can use V2 now and switch to V3 in the future.
V2 - 17.10
This version uses our latest stable developments. It produces good results on almost all types of images within an acceptable execution time. It is based on classical OCR algorithms, so it is neither very fast nor of the highest quality, but its advantage is document structure recognition: it processes document pages with images, tables, and other content and automatically detects the text regions to recognize.
V3 - 18.09
This is the latest version. It has just been released and is not yet perfect. It is based on cutting-edge neural network algorithms, distributed computing, and a scalable cloud architecture. The key features are document structure recognition (DSR), skew correction, and fast text recognition. It also gives good results on text-only images: up to 97.7%, measured as 1 - (Levenshtein / Text_Length).
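The accuracy figure above is expressed as 1 - (Levenshtein / Text_Length). A minimal Python sketch of how such a score could be computed on your own samples follows. It assumes Text_Length is the length of the reference (ground-truth) text, which the description here does not state, and the function names levenshtein and recognition_accuracy are illustrative only; they are not part of any Aspose SDK.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the working row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def recognition_accuracy(recognized: str, reference: str) -> float:
    """Compute 1 - (Levenshtein / Text_Length).

    Assumption: Text_Length is the length of the reference text.
    """
    if not reference:
        raise ValueError("reference text must not be empty")
    return 1.0 - levenshtein(recognized, reference) / len(reference)
```

For example, recognition_accuracy("he1lo world", "hello world") has one wrong character out of eleven and evaluates to 1 - 1/11, roughly 0.909. Under this definition the score can drop below zero when the recognized text is much longer than the reference.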
1. Get API keys if you haven't already
This step takes about 2 minutes and allows you to recognize 30 MB of image files.
2. Upload an image file to storage
On the same page, go to the My Files tab and upload the files you want to recognize.
3. Check the API
Use the API Reference page to check your keys and API usage.
Currently only the .NET SDK is available. Packages for other widely used programming languages will be available soon; we are working on them right now.
Our new version is based on advanced neural network algorithms and distributed computing.
Advantages of this algorithm:
- It is faster than V2: mean page processing time is less than 3 seconds
- Excellent recognition results. Mean recognition quality is greater than 97% on text-only samples
- It is based on a modern, promising, scalable cloud architecture
- We've released our skew correction module, which makes it possible to recognize slightly rotated images
- We've improved the text recognition module to fix many issues from our roadmap
- We've integrated TensorFlow Serving into our pipeline
We've included our current groundwork on the DSR feature, with the following classes: paragraph, table, image, formula, header, caption and list. We've also prepared a text line recognition module that replaces the word splitter and recognition modules. The OCR engine works slightly better and faster thanks to these improvements.
In the upcoming releases, we are set to implement a number of new features and fix issues:
- Build SDK packages for Java, PHP, Ruby, Python, Node.js, Android, Objective-C, Perl
- Speed up single-page recognition to under 2 seconds
- Add batch file processing
- Add selection of custom image regions to recognize
- Add document structure processing so that any document can be processed (pages with images, tables, etc.)
- Add more output formats: hOCR and PDF
- Fix issue of collapsing words
- Fix issue of duplicate letters
- Fix issues with punctuation symbols
- Learn to recognize the following characters: hyphen-minus (-), dash (–), grave accent (`), underscore (_), slashes (/) (\)