title | description | ms.topic | ms.date | ms.author | author |
---|---|---|---|---|---|
API ref for Text Recognition APIs in the Windows App SDK | Learn about the new Artificial Intelligence (AI) text recognition APIs that will ship with the Windows App SDK and can be used to identify characters in an image, recognize words, lines, polygonal boundaries, and provide confidence levels for the generated matches. | article | 06/21/2024 | kbridge | karl-bridge-microsoft |
Learn about the new Artificial Intelligence (AI) text recognition APIs that will ship with the Windows App SDK and can be used to identify characters in an image, recognize words, lines, polygonal boundaries, and provide confidence levels for the generated matches.
For more details, see Text Recognition in the Windows App SDK.
Important
The Windows App SDK experimental channel includes APIs and features in early stages of development. All APIs in the experimental channel are subject to extensive revisions and breaking changes and may be removed from subsequent releases at any time. They are not supported for use in production environments, and apps that use experimental features cannot be published to the Microsoft Store.
Provides APIs for machine learning models that analyze the textual content of images.
public struct BoundingBox
A polygon with 4 points used for the boundary of recognized words and lines of text.
The bottom left corner of the bounding box.
The bottom right corner of the bounding box.
The top left point of the bounding box.
The top right point of the bounding box.
When returned as a boundary for a word or line, the TopLeft, TopRight, BottomRight, and BottomLeft points are relative to the rotation and skew of the recognized text in the image. The following diagram shows the point layout for different text rotations where 0 is TopLeft, 1 is TopRight, 2 is BottomRight, and 3 is BottomLeft, all relative to the text.
:::image type="content" source="../images/bounding-box-examples.png" alt-text="Diagram of three bounding box examples showing how corner points are identified based on text rotation.":::
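Because the corner points follow the text rather than the image axes, the direction from TopLeft to TopRight gives the reading direction of a word or line. The following is a minimal sketch of that idea, not part of the API: `TextAngleDegrees` is a hypothetical helper, the tuples stand in for the corner points of a BoundingBox, and it assumes image coordinates where Y increases downward, so a positive result corresponds to a clockwise rotation.

```csharp
using System;

// Hypothetical helper (not part of the Windows App SDK).
// The (X, Y) tuples stand in for the TopLeft and TopRight points of a BoundingBox.
// Assumes image coordinates with Y increasing downward, so positive angles are clockwise.
static double TextAngleDegrees((double X, double Y) topLeft, (double X, double Y) topRight)
{
    double dx = topRight.X - topLeft.X;
    double dy = topRight.Y - topLeft.Y;

    // Atan2 returns radians in the range (-pi, pi]; convert to degrees.
    return Math.Atan2(dy, dx) * 180.0 / Math.PI;
}

// Upright text: TopRight lies directly to the right of TopLeft.
Console.WriteLine(TextAngleDegrees((10, 20), (110, 20)));
```

For upright text the two points share the same Y value, so the computed angle is 0.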
public enum DetectedLineStyle
Specifies the line styles that can be recognized.
The line of text is handwritten.
The line of text is not handwritten.
public enum OrientationDetectionOptions
Specifies the text orientations that can be recognized.
Orientation is not recognized.
Orientation is recognized.
public sealed class RecognizedLine
Represents a single line of recognized text.
public Microsoft.Windows.Vision.RecognizedLineStyle Style { get; }
Gets the recognized line style.
The recognized line style.
Includes whether the line of text was handwritten or not and the level of recognition confidence.
public string Text { get; }
Gets the text of the recognized line.
The text of the recognized line.
All words concatenated with spaces.
public Microsoft.Windows.Vision.RecognizedWord[] Words { get; }
Gets the words in the recognized line.
The words in the recognized line.
public struct RecognizedLineStyle
Represents the style of the recognized line.
The confidence level of the line style recognition.
The line style name.
public sealed class RecognizedText
Represents the result of an image-to-text recognition operation.
public float ImageAngle { get; }
Gets the clockwise rotational angle of the recognized text in degrees.
The clockwise rotational angle of the recognized text in degrees.
public Microsoft.Windows.Vision.RecognizedLine[] Lines { get; }
Gets the collection of recognized lines.
The collection of recognized lines.
public sealed class RecognizedWord
Represents a single recognized word.
public Microsoft.Windows.Vision.BoundingBox BoundingBox { get; }
Gets the quadrilateral boundary of the recognized word.
The quadrilateral boundary of the recognized word. TopLeft is relative to the word's rotation.
public float Confidence { get; }
Gets how likely this word was recognized correctly.
How likely this word was recognized correctly. Value ranges from 0.0 to 1.0, inclusive.
public string Text { get; }
Gets the text of the recognized word.
The text of the recognized word.
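One possible use of the Confidence value is to drop uncertain words before rebuilding a line of text. The sketch below is illustrative only: `JoinConfidentWords` is a hypothetical helper, the tuples stand in for the Text and Confidence properties of RecognizedWord, and the 0.5 threshold is an arbitrary choice rather than a recommended value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of the Windows App SDK).
// Each tuple stands in for the Text and Confidence of a RecognizedWord.
static string JoinConfidentWords(
    IEnumerable<(string Text, float Confidence)> words, float minConfidence)
{
    // Keep only words at or above the threshold, then join them with spaces,
    // mirroring how a line's Text concatenates its words.
    var kept = words.Where(w => w.Confidence >= minConfidence).Select(w => w.Text);
    return string.Join(" ", kept);
}

var sample = new[] { ("Invoice", 0.98f), ("t0tal", 0.41f), ("due", 0.87f) };
Console.WriteLine(JoinConfidentWords(sample, 0.5f));
```

With the sample data, the middle word falls below the threshold and is left out of the joined text.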
public sealed class TextRecognizer : System.IDisposable
Recognizes words and lines, and their quadrilateral boundaries, in a source image.
Disposes of the object and associated resources.
Not implemented in C#.
public static Windows.Foundation.IAsyncOperation<Microsoft.Windows.Vision.TextRecognizer> CreateAsync ();
Asynchronously creates a new instance of the TextRecognizer class.
A new instance of the TextRecognizer class.
This will return an error if GetModelReadyStatus is not Ready.
public static bool IsAvailable ();
Retrieves whether the underlying language model is installed.
True if the underlying language model is installed. Otherwise, false.
public static Windows.Foundation.IAsyncOperationWithProgress<Microsoft.Windows.Management.Deployment.PackageDeploymentResult,
Microsoft.Windows.Management.Deployment.PackageDeploymentProgress> MakeAvailableAsync ();
Ensures the underlying language model is installed and available for use.
An asynchronous operation with progress that returns a PackageDeploymentResult on completion.
Microsoft.Windows.Vision.TextRecognizer.RecognizeTextFromImage(Microsoft.Windows.Imaging.ImageBuffer, Microsoft.Windows.Vision.TextRecognizerOptions) method
public Microsoft.Windows.Vision.RecognizedText RecognizeTextFromImage (Microsoft.Windows.Imaging.ImageBuffer imageBuffer,
Microsoft.Windows.Vision.TextRecognizerOptions options);
Recognize text in the provided image.
An uncompressed bitmap.
Options for configuring the text recognition model for the TextRecognizer.
The recognized text.
Microsoft.Windows.Vision.TextRecognizer.RecognizeTextFromImageAsync(Microsoft.Windows.Imaging.ImageBuffer, Microsoft.Windows.Vision.TextRecognizerOptions) method
public Windows.Foundation.IAsyncOperation<Microsoft.Windows.Vision.RecognizedText> RecognizeTextFromImageAsync (Microsoft.Windows.Imaging.ImageBuffer imageBuffer,
Microsoft.Windows.Vision.TextRecognizerOptions options);
Asynchronously recognize text in the provided image.
An uncompressed bitmap.
Options for configuring the text recognition model for the TextRecognizer.
The recognized text.
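The members described above can be combined into a single recognition flow. The following is a sketch only, assuming a project that references the experimental Windows App SDK channel. `TextRecognitionSketch` and `ReadTextAsync` are hypothetical names, the `MaxLineCount` value of 50 is arbitrary, and the status check relies on the `PackageDeploymentStatus` enumeration from the Microsoft.Windows.Management.Deployment namespace, which is not described on this page.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Windows.Imaging;
using Microsoft.Windows.Management.Deployment;
using Microsoft.Windows.Vision;
using Windows.Graphics.Imaging;

public static class TextRecognitionSketch
{
    // Hypothetical helper: returns the recognized lines of a bitmap, one per line.
    public static async Task<string> ReadTextAsync(SoftwareBitmap bitmap)
    {
        // Make sure the underlying model is installed before creating the recognizer.
        if (!TextRecognizer.IsAvailable())
        {
            PackageDeploymentResult result = await TextRecognizer.MakeAvailableAsync();
            if (result.Status != PackageDeploymentStatus.CompletedSuccess)
            {
                throw new InvalidOperationException(
                    $"Model deployment ended with status {result.Status}.");
            }
        }

        using TextRecognizer recognizer = await TextRecognizer.CreateAsync();

        // CreateBufferAttachedToBitmap returns null for unsupported bitmap formats.
        using ImageBuffer imageBuffer = ImageBuffer.CreateBufferAttachedToBitmap(bitmap)
            ?? throw new NotSupportedException("The bitmap format is not supported.");

        // Arbitrary example value; the effective limit is capped by the API.
        var options = new TextRecognizerOptions { MaxLineCount = 50 };

        RecognizedText recognized =
            await recognizer.RecognizeTextFromImageAsync(imageBuffer, options);

        return string.Join(Environment.NewLine, recognized.Lines.Select(line => line.Text));
    }
}
```

The asynchronous overload is used here so that a UI thread is not blocked while the model runs. RecognizeTextFromImage takes the same arguments when a synchronous call is acceptable.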
public sealed class TextRecognizerOptions
Provides options to configure the text recognition model for a TextRecognizer.
public Windows.Graphics.SizeInt32 MaxAnalysisSize { get; set; }
Gets or sets the maximum image size.
The maximum image size. Default value is 1152 width and 768 height.
This size is a suggestion, and might not always be honored.
If the source image is larger than the maximum size, it will automatically be scaled down to the upper size limits.
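To get a feel for the effect of MaxAnalysisSize, the resulting dimensions can be estimated ahead of time. The sketch below assumes that scaling preserves the aspect ratio and rounds down, which the remarks above do not state. `FitWithin` is a hypothetical helper and not part of the API, and the actual scaling behavior may differ.

```csharp
using System;

// Hypothetical helper (not part of the Windows App SDK).
// Assumes aspect-ratio-preserving downscaling; the real behavior may differ.
static (int Width, int Height) FitWithin(int width, int height, int maxWidth, int maxHeight)
{
    // Images already within the limits are left unchanged.
    if (width <= maxWidth && height <= maxHeight)
    {
        return (width, height);
    }

    // Compare aspect ratios by cross-multiplication to avoid floating-point rounding.
    if ((long)width * maxHeight > (long)height * maxWidth)
    {
        // Width is the limiting dimension.
        return (maxWidth, Math.Max(1, (int)((long)height * maxWidth / width)));
    }

    // Height is the limiting dimension.
    return (Math.Max(1, (int)((long)width * maxHeight / height)), maxHeight);
}

// A wide 4000 x 1000 image checked against the default 1152 x 768 limit.
Console.WriteLine(FitWithin(4000, 1000, 1152, 768));
```

Under these assumptions, a 4000 by 1000 image is limited by its width and comes out at 1152 by 288.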
public uint MaxLineCount { get; set; }
Gets or sets the maximum number of lines to return from the recognition operation.
The maximum number of lines to return from the recognition operation.
Defaults to MaxLineCountSupported. If specified, the maximum lines returned will be the lesser of this value and MaxLineCountSupported.
public Microsoft.Windows.Vision.OrientationDetectionOptions OrientationDetection { get; set; }
Gets or sets whether to detect the text orientation.
Whether to detect the text orientation. Default value is None.
public TextRecognizerOptions ();
Initializes a new instance of the TextRecognizerOptions class.
Provides APIs for machine learning models that manipulate images.
public sealed class ImageBuffer : System.IDisposable
Represents an uncompressed bitmap for efficient cross-process marshaling.
ImageBuffer can be used with AI model APIs such as TextRecognizer that require image data. Typical usage involves creating an ImageBuffer from an existing SoftwareBitmap.
public Windows.Storage.Streams.IBuffer Buffer { get; }
Gets the current image buffer.
The current image buffer.
public uint BufferLength { get; }
Gets the length of the image buffer.
The length of the image buffer.
Disposes of the object and associated resources.
Not implemented in C#.
public void CopyToBuffer (byte[] values);
Copies the current buffer into the provided target buffer.
Vector of bytes in the buffer.
Microsoft.Windows.Imaging.ImageBuffer.CreateBufferAttachedToBitmap(Windows.Graphics.Imaging.SoftwareBitmap) method
public static Microsoft.Windows.Imaging.ImageBuffer CreateBufferAttachedToBitmap (Windows.Graphics.Imaging.SoftwareBitmap softwareBitmap);
Create a new ImageBuffer from an existing SoftwareBitmap by getting an IMemoryBufferReference from the bitmap object.
The SoftwareBitmap to create the ImageBuffer from.
The ImageBuffer or null if it's an unsupported format.
The SoftwareBitmap is locked until the async operation completes and the new ImageBuffer is destroyed.
Microsoft.Windows.Imaging.ImageBuffer.CreateCopyFromBitmap(Windows.Graphics.Imaging.SoftwareBitmap) method
public static Microsoft.Windows.Imaging.ImageBuffer CreateCopyFromBitmap (Windows.Graphics.Imaging.SoftwareBitmap softwareBitmap);
Create a new ImageBuffer from an existing SoftwareBitmap by copying out the underlying bitmap data.
The SoftwareBitmap to create the ImageBuffer from.
The ImageBuffer or null if it's an unsupported format.
The SoftwareBitmap is locked until the async operation completes and the new ImageBuffer is destroyed.
public Windows.Graphics.Imaging.SoftwareBitmap CreateSoftwareBitmap ();
Create a new SoftwareBitmap of pixel type BGRA32 from the pixel data stored in an ImageBuffer.
The new SoftwareBitmap of pixel type BGRA32.
public uint Height { get; }
Gets the height of the image, in pixels.
The height of the image, in pixels.
Microsoft.Windows.Imaging.ImageBuffer.#ctor(Windows.Storage.Streams.IBuffer, Microsoft.Windows.Imaging.PixelFormat, System.UInt32, System.UInt32) constructor
public ImageBuffer (Windows.Storage.Streams.IBuffer buffer,
Microsoft.Windows.Imaging.PixelFormat pixelFormat, uint width, uint height);
Initializes a new instance of the ImageBuffer class.
The buffer that contains the image data.
The pixel format of the image.
The width of the image, in pixels.
The height of the image, in pixels.
public Microsoft.Windows.Imaging.PixelFormat PixelFormat { get; }
Gets the pixel format of the image.
The pixel format of the image.
public uint Width { get; }
Gets the width of the image, in pixels.
The width of the image, in pixels.
public enum PixelFormat
Specifies the types of binary layouts for the underlying bitmap data.
Binary format is undefined.
The binary format is 24 bits per pixel; 8 bits each are used for the red, green, and blue components.
The binary format is 32 bits per pixel; 8 bits each are used for the alpha, red, green, and blue components.
The binary format is 32 bits per pixel; 8 bits each are used for the red, green, blue, and alpha components. The color components are stored in red, green, blue, and alpha order.
The binary format is 32 bits per pixel; 8 bits each are used for the blue, green, red, and alpha components. The color components are stored in blue, green, red, and alpha order.
The binary format is 16 bits per pixel. The color information specifies 65536 shades of gray.
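The bits-per-pixel figures above determine how much data an IBuffer must hold for a given width and height. The following sketch estimates that size under the assumption that rows are tightly packed with no stride padding, which this page does not confirm. `MinimumBufferLength` is a hypothetical helper and not part of the API.

```csharp
using System;

// Hypothetical helper (not part of the Windows App SDK).
// Assumes tightly packed rows with no stride padding.
static ulong MinimumBufferLength(uint width, uint height, uint bitsPerPixel)
{
    if (bitsPerPixel == 0 || bitsPerPixel % 8 != 0)
    {
        throw new ArgumentException(
            "Expected a whole number of bytes per pixel.", nameof(bitsPerPixel));
    }

    // Widen to 64 bits before multiplying so large images do not overflow.
    return (ulong)width * height * (bitsPerPixel / 8);
}

// A 1152 x 768 image in one of the 32-bits-per-pixel formats.
Console.WriteLine(MinimumBufferLength(1152, 768, 32));
```

Comparing this estimate with BufferLength can serve as a quick sanity check before constructing an ImageBuffer from raw pixel data.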