🚀 Azure Document Analyzer - Hands-On Lab

Welcome to the Azure Document Analyzer Lab! This is a comprehensive, step-by-step learning experience that will guide you through building a web-based application using Azure AI Form Recognizer to analyze documents and extract structured data.

📚 Lab Navigation

Document | Description | Who It's For
OVERVIEW.md | 🗺️ Visual learning path and lab structure | First-time visitors
README.md | ⭐ Main lab guide with step-by-step instructions | Everyone - main content!
QUICKSTART.md | 5-minute quick setup for experienced developers | Experienced developers
EXERCISES.md | Practice exercises and challenges | After completing main lab
FAQ.md | Frequently asked questions | When you need help
GLOSSARY.md | Terms and definitions | Reference as needed
CONTRIBUTING.md | How to improve this lab | Contributors
LICENSE | MIT License | Legal info

💡 Tip: New to this lab? Start with OVERVIEW.md to see the big picture, then come back here!


🎯 What You'll Learn

By the end of this lab, you will:

  • ✅ Create and configure an Azure Form Recognizer resource
  • ✅ Build a web application that analyzes PDFs and images
  • ✅ Extract key-value pairs, tables, paragraphs, and selection marks
  • ✅ Integrate Azure AI services into a JavaScript application
  • ✅ Deploy and test a document analysis solution

🔄 CI/CD & GitHub Flow

This project uses GitHub Actions for continuous integration and deployment:

Automated Workflows

  • CI Workflow: Runs on every pull request

    • ✅ Installs dependencies
    • ✅ Runs ESLint for code quality
    • ✅ Builds the project with webpack
    • ✅ Verifies build output
  • Deploy Workflow: Automatically deploys to GitHub Pages when changes are merged to main

    • ✅ Builds the application
    • ✅ Deploys to GitHub Pages
    • ✅ Makes your app publicly accessible

GitHub Flow Process

  1. Create a feature branch from main
  2. Make your changes and commit them
  3. Open a pull request for review
  4. CI automatically validates your changes
  5. After review and approval, merge to main
  6. The app is deployed to GitHub Pages automatically

Viewing the Live Site

Once deployed, your application will be available at:

https://Sayan-dev731.github.io/document-analyzer/

To enable GitHub Pages deployment:

  1. Go to repository Settings > Pages
  2. Set Source to "GitHub Actions"
  3. Workflows will automatically deploy on merge to main

📋 Lab Overview

This lab is organized into progressive steps that build upon each other. Each step includes:

  • Clear instructions with explanations
  • Visual guides showing exactly what to do
  • Code explanations to help you understand how it works
  • Verification steps to ensure everything is working

🏁 Step 0: Prerequisites

Before you begin, make sure you have:

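Required Tools

  • Node.js v14.0.0 or higher
  • npm 6.0.0 or higher
  • Git 2.0 or higher

Required Accounts

  • An Azure account with an active subscription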

Verify Your Setup

Open a terminal and run these commands to verify your installations:

node --version    # Should show v14.0.0 or higher
npm --version     # Should show 6.0.0 or higher
git --version     # Should show git version 2.0 or higher

✅ Checkpoint: All commands should return version numbers without errors.
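If you'd rather check these minimums from a script, the comparison can be sketched as below. This is illustrative only: `parseVersion` and `meetsMinimum` are hypothetical helpers, not part of the lab code, and they compare plain `major.minor.patch` numbers while ignoring any pre-release suffix.

```javascript
// Hypothetical helper: pull the first version number out of a string such as
// "v18.17.0" or "git version 2.39.2". Missing minor/patch parts count as 0.
function parseVersion(text) {
  const match = /(\d+)(?:\.(\d+))?(?:\.(\d+))?/.exec(text);
  if (!match) {
    throw new Error(`No version number found in "${text}"`);
  }
  return [match[1], match[2], match[3]].map((part) => Number(part ?? 0));
}

// Returns true when the reported version is equal to or newer than the minimum.
function meetsMinimum(reported, minimum) {
  const have = parseVersion(reported);
  const need = parseVersion(minimum);
  for (let i = 0; i < 3; i++) {
    if (have[i] !== need[i]) {
      return have[i] > need[i];
    }
  }
  return true; // identical versions satisfy the minimum
}

console.log(meetsMinimum("v18.17.0", "14.0.0"));        // true
console.log(meetsMinimum("git version 2.39.2", "2.0")); // true
console.log(meetsMinimum("5.8.0", "6.0.0"));            // false
```

Inside Node.js you could pass `process.version` as the reported value, for example `meetsMinimum(process.version, "14.0.0")`.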


🔧 Step 1: Create Azure Form Recognizer Resource

In this step, you'll create an Azure Form Recognizer resource that provides the AI capabilities for document analysis.

1.1 Sign in to Azure Portal

  1. Open your browser and go to https://portal.azure.com
  2. Sign in with your Azure account credentials

1.2 Create a New Resource

  1. Click "Create a resource" (the ➕ icon in the top-left corner)
  2. In the search bar, type "Form Recognizer" or "Document Intelligence"
  3. Select "Form Recognizer" from the results (the portal may list it under its newer name, "Document Intelligence")
  4. Click "Create"

(Screenshot: Create Azure Resource)

1.3 Configure Your Resource

Fill in the following details:

Field | Value | Description
Subscription | Select your subscription | The Azure subscription to bill
Resource Group | Create new: document-analyzer-rg | Logical container for your resources
Region | Choose closest to you | Where your service will be hosted
Name | my-document-analyzer | Unique name for your service
Pricing Tier | Free F0 or Standard S0 | Free tier: 500 pages/month

(Screenshot: Configure Resource)

  1. Click "Review + Create"
  2. Review your settings
  3. Click "Create"

⏱️ Wait time: 1-2 minutes for deployment to complete

1.4 Get Your Credentials

Once deployment is complete:

  1. Click "Go to resource"
  2. In the left menu, click "Keys and Endpoint"
  3. You'll see:
    • KEY 1 and KEY 2 (either one works)
    • Endpoint URL

(Screenshot: Keys and Endpoint)

  4. Copy both values - you'll need them soon!
    • Click the 📋 copy icon next to KEY 1
    • Click the 📋 copy icon next to Endpoint

⚠️ Security Note: Keep these credentials secure. Never share them publicly or commit them to version control!

✅ Checkpoint: You should have:

  • ✅ A Form Recognizer resource deployed in Azure
  • ✅ An API key copied
  • ✅ An endpoint URL copied

📦 Step 2: Clone and Setup the Project

Now let's get the project code and install its dependencies.

2.1 Clone the Repository

Open your terminal and run:

# Navigate to your preferred directory
cd ~/Documents  # or wherever you keep your projects

# Clone the repository
git clone https://github.com/Sayan-dev731/document-analyzer.git

# Navigate into the project
cd document-analyzer

2.2 Explore the Project Structure

Take a moment to understand what you have:

document-analyzer/
├── src/
│   └── index.js           # Main application logic (Azure integration here!)
├── images/                # Lab screenshots and visual guides
├── dist/                  # Compiled JavaScript (generated by webpack)
├── index.html             # Main HTML page
├── styles.css             # Application styles
├── image.png              # Logo
├── package.json           # Node.js dependencies
├── webpack.config.js      # Webpack bundler configuration
└── readme.md              # This lab guide!

2.3 Install Dependencies

Run the following command to install all required packages:

npm install

This installs:

  • @azure/ai-form-recognizer: Azure SDK for document analysis
  • webpack: Module bundler to compile your code
  • webpack-cli: Command-line interface for webpack

⏱️ Wait time: 30-60 seconds

✅ Checkpoint: You should see:

  • ✅ A node_modules/ directory created
  • ✅ No error messages
  • ✅ Message like "added XXX packages"

🔐 Step 3: Configure Your Azure Credentials

Now it's time to connect your application to your Azure Form Recognizer service.

3.1 Open the Code

  1. Open the project in your code editor:

    code .  # If using VS Code
  2. Navigate to src/index.js

3.2 Update Credentials

Find these lines near the top of src/index.js:

const key = "paste-your-api-key-here";
const endpoint = "paste-your-endpoint-url-here";

Replace them with YOUR credentials from Step 1.4:

const key = "YOUR_API_KEY_FROM_AZURE_PORTAL";
const endpoint = "YOUR_ENDPOINT_URL_FROM_AZURE_PORTAL";
// Example endpoint format: https://your-resource-name.cognitiveservices.azure.com/
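Before rebuilding, it can help to sanity-check the two values you pasted. The sketch below is a hypothetical pre-flight check, not part of the lab code: `checkCredentials` only looks for obvious mistakes (leftover placeholders, stray spaces, a malformed or non-HTTPS endpoint) and cannot tell whether Azure will actually accept the key.

```javascript
// Hypothetical pre-flight check, not part of the lab code.
// Returns a list of human-readable problems; an empty list means
// the values look plausible (not that Azure will accept them).
function checkCredentials(key, endpoint) {
  const problems = [];

  if (typeof key !== "string" || key.trim() === "") {
    problems.push("API key is empty");
  } else if (key !== key.trim()) {
    problems.push("API key has leading or trailing spaces");
  } else if (key.startsWith("paste-your") || key.startsWith("YOUR_")) {
    problems.push("API key is still a placeholder");
  }

  let url = null;
  try {
    url = new URL(endpoint); // throws on strings that are not absolute URLs
  } catch {
    problems.push("Endpoint is not a valid URL");
  }
  if (url && url.protocol !== "https:") {
    problems.push("Endpoint should use https");
  }

  return problems;
}

console.log(checkCredentials("paste-your-api-key-here", "paste-your-endpoint-url-here"));
// [ 'API key is still a placeholder', 'Endpoint is not a valid URL' ]
```

You could call it once near the top of src/index.js and log the returned list to the browser console while you are setting things up.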

3.3 Understanding the Code

Let's understand what this code does:

// Import Azure SDK components
import {
  AzureKeyCredential,
  DocumentAnalysisClient,
} from "@azure/ai-form-recognizer";

// Your credentials (you just updated these!)
const key = "YOUR_API_KEY_HERE";
const endpoint = "YOUR_ENDPOINT_URL_HERE";

// Main function that analyzes documents
async function analyzeDocument(file) {
  // 1. Create a client to communicate with Azure
  const client = new DocumentAnalysisClient(
    endpoint,
    new AzureKeyCredential(key)
  );
  
  // 2. Start the analysis using the pre-built document model
  const poller = await client.beginAnalyzeDocument("prebuilt-document", file);
  
  // 3. Wait for the analysis to complete
  const result = await poller.pollUntilDone();
  
  // 4. Return the results
  return result;
}

What does "prebuilt-document" mean?

  • Azure provides pre-trained models for common document types
  • No training required - it works out of the box!
  • Extracts: text, tables, key-value pairs, and more

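3.4 Save Your Changes

Save src/index.js before moving on.

⚠️ Important Security Note: In a real production application, you would:

  • Store credentials in environment variables
  • Use Azure Key Vault for secrets management
  • Never commit credentials to Git

For this lab, we're hard-coding credentials for simplicity. Keep in mind that webpack bundles them into dist/bundle.js, so anyone who can load the page can read the key. Don't commit or publicly deploy a build that contains a real key.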

✅ Checkpoint:

  • ✅ Your API key and endpoint are updated in src/index.js
  • ✅ File is saved

🏗️ Step 4: Build the Application

Now let's compile the application so it can run in your browser.

4.1 Build with Webpack

You can now use npm scripts to build the application:

# Build for production
npm run build

# Run linter to check code quality
npm run lint

# Run linter and auto-fix issues
npm run lint:fix

Or use webpack directly:

npx webpack

What's happening?

  • Webpack reads your src/index.js file
  • It bundles all the code and dependencies together
  • It creates a single file: dist/bundle.js
  • The HTML file loads this bundle to run your app

You should see output like:

asset bundle.js 119 KiB [emitted] (name: main)
webpack 5.97.1 compiled successfully in 2905 ms

4.2 Understanding Webpack Configuration

Open webpack.config.js to see how it's configured:

import path from "path";
import { fileURLToPath } from "url";

const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);

export default {
  entry: "./src/index.js",        // Start from this file
  output: {
    filename: "bundle.js",         // Create this output file
    path: path.resolve(__dirname, "dist"),  // In the dist folder
  },
  mode: "production",              // Optimize for production
};
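If you want readable stack traces while debugging, one option is to derive the config from a mode instead of fixing it to production. The sketch below is a hypothetical variation, not the repository's actual config: `makeConfig` is an invented name, and `output.path` is omitted because webpack defaults it to a dist folder in the current working directory.

```javascript
// Hypothetical variation on webpack.config.js: build the config from a mode
// so that development builds get source maps and production builds do not.
function makeConfig(mode) {
  const isDev = mode === "development";
  return {
    entry: "./src/index.js",           // same entry point as the lab config
    output: { filename: "bundle.js" }, // path falls back to webpack's default (dist)
    mode: isDev ? "development" : "production",
    devtool: isDev ? "source-map" : false,
  };
}

console.log(makeConfig("development").devtool); // source-map
console.log(makeConfig("production").mode);     // production
```

Webpack also accepts a function as the default export, so webpack.config.js could end with `export default (env, argv) => makeConfig(argv.mode);` and you would pick the mode with `npx webpack --mode development`.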

✅ Checkpoint:

  • ✅ Build completed successfully
  • ✅ dist/bundle.js file created
  • ✅ No error messages

🧪 Step 5: Test the Application

Time to see your document analyzer in action!

5.1 Start the Application

You can run the application in two ways:

Option A: Simple File Open

# Open the file directly in your default browser
open index.html        # macOS
start index.html       # Windows
xdg-open index.html    # Linux

Option B: Local Web Server (Recommended)

# Install http-server globally (one-time setup)
npm install -g http-server

# Start the server
http-server

# Open browser to: http://localhost:8080

5.2 Use the Application

You should see a clean, modern interface:

(Screenshot: Application Interface)

  1. Click "Browse" to select a file

    • Supported formats: PDF, JPEG, PNG, TIFF
    • Maximum file size: 50 MB (the Free F0 tier has a lower service-side limit, documented by Azure as 4 MB per file)
  2. Select a test document

    • Use any PDF or image with text
    • Invoices, receipts, and forms work great!
    • If you don't have one, you can download sample forms from Azure samples
  3. Click "Analyze Document"

  4. Wait for results (usually 2-10 seconds depending on document size)

5.3 Understand the Results

The application will display extracted information in a table:

(Screenshot: Results Display)

You'll see:

Data Type | What It Shows | Example
Key-Value Pairs | Labeled fields and their values | "Invoice Number: INV-12345"
Paragraphs | Text blocks identified in the document | Full sentences and paragraphs
Tables | Structured data in rows/columns | Price lists, line items
Selection Marks | Checkboxes and their states | ☑ Selected or ☐ Unselected
Confidence | AI's certainty (0.0 to 1.0) | 0.98 = 98% confident
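To make the confidence column easier to read, a score can be turned into a percentage and low scores flagged for a manual look. This is a minimal sketch with hypothetical helper names; the 0.8 threshold is an arbitrary example, not an Azure recommendation.

```javascript
// Hypothetical display helpers, not part of the lab code.

// Turn a 0.0-1.0 score into a whole-number percentage string.
function formatConfidence(confidence) {
  if (typeof confidence !== "number" || Number.isNaN(confidence)) {
    return "n/a"; // some result items carry no confidence value
  }
  return `${Math.round(confidence * 100)}%`;
}

// Flag items whose score is missing or below the threshold.
// The 0.8 default is an arbitrary example value.
function needsReview(confidence, threshold = 0.8) {
  return typeof confidence !== "number" || confidence < threshold;
}

console.log(formatConfidence(0.98)); // 98%
console.log(needsReview(0.98));      // false
console.log(needsReview(0.42));      // true
```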

5.4 Troubleshooting

Problem: "Error: Invalid credentials"

  • ✅ Double-check your API key and endpoint in src/index.js
  • ✅ Ensure there are no extra spaces
  • ✅ Rebuild: npx webpack

Problem: "Error: No file selected"

  • ✅ Make sure you clicked "Browse" and selected a file

Problem: "The bundle.js file is not loading"

  • ✅ Check that dist/bundle.js exists
  • ✅ Run npx webpack again
  • ✅ Hard refresh your browser (Ctrl+F5 or Cmd+Shift+R)

Problem: "CORS error"

  • ✅ Use a local web server (Option B above)
  • ✅ Don't open the HTML file directly from the file system

✅ Checkpoint:

  • ✅ Application opens in browser
  • ✅ You can upload a file
  • ✅ Document analysis completes successfully
  • ✅ Results are displayed in the table

🧠 Step 6: Understand the Code

Now that everything works, let's dive deeper into how the application functions.

6.1 The HTML Structure (index.html)

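<!-- File input for uploading documents. As written, accept=".pdf" limits the
     file picker to PDFs; widen it (e.g. ".pdf,.jpg,.jpeg,.png,.tif,.tiff")
     to pick the image formats listed in Step 5.2 -->
<input type="file" id="fileInput" accept=".pdf" />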

<!-- Button to trigger analysis -->
<button id="analyzeButton">Analyze Document</button>

<!-- Results table (hidden until analysis completes) -->
<table id="resultsTable">
  <tbody id="resultsBody"></tbody>
</table>

6.2 The Event Handler (src/index.js)

document.getElementById("analyzeButton").addEventListener("click", async () => {
  // 1. Get the selected file
  const fileInput = document.getElementById("fileInput");
  const file = fileInput.files[0];

  // 2. Validate file was selected
  if (!file) {
    alert("Please select a file!");
    return;
  }

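  // 3. Show loading state (look the element up explicitly instead of
  //    relying on the implicit global that browsers create for element ids)
  const resultsBody = document.getElementById("resultsBody");
  resultsBody.innerHTML = "<tr><td colspan='3'>Loading...</td></tr>";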

  try {
    // 4. Convert file to Blob for Azure SDK
    const fileData = new Blob([file], { type: file.type });

    // 5. Call Azure to analyze the document
    const analysisResult = await analyzeDocument(fileData);

    // 6. Display the results
    displayResults(analysisResult);
  } catch (error) {
    // 7. Handle any errors
    resultsBody.innerHTML = `<tr><td colspan='3'>Error: ${error.message}</td></tr>`;
  }
});
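The handler above writes `error.message` straight into `innerHTML`, so any markup in the message would be interpreted by the browser. One way to guard against that is to escape the text first. The helpers below are a sketch with hypothetical names (`escapeHtml`, `errorRow`), not code from the repository.

```javascript
// Replace the characters that have special meaning in HTML.
// "&" is handled first so the "&" in the entities produced by the
// later replacements is not escaped a second time.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Builds the same error row as the handler above, with the message escaped.
function errorRow(message) {
  return `<tr><td colspan='3'>Error: ${escapeHtml(message)}</td></tr>`;
}

console.log(errorRow("<b>bad</b> & worse"));
// <tr><td colspan='3'>Error: &lt;b&gt;bad&lt;/b&gt; &amp; worse</td></tr>
```

In the catch block you would then write `resultsBody.innerHTML = errorRow(error.message);`. Setting `textContent` on a cell element is another option that avoids building HTML strings at all.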

6.3 The Display Logic

The displayResults() function handles four types of extracted data:

1. Key-Value Pairs - Labeled information

result.keyValuePairs.forEach((kvp) => {
  // kvp.key.content: "Invoice Date"
  // kvp.value.content: "2024-01-15"
  // kvp.confidence: 0.98
});
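A key can come back without a value, for example when a form field was left blank, and the whole list may be absent. The sketch below flattens the pairs into plain rows while tolerating both cases. `keyValueRows` is a hypothetical helper and the sample data is invented.

```javascript
// Hypothetical helper: flatten key-value pairs into plain row objects.
// Missing lists, keys, or values become empty strings instead of throwing.
function keyValueRows(result) {
  return (result.keyValuePairs ?? []).map((kvp) => ({
    key: kvp.key?.content ?? "",
    value: kvp.value?.content ?? "",
    confidence: kvp.confidence,
  }));
}

// Sample data shaped like the comments above (values are invented).
const sampleKeyValueResult = {
  keyValuePairs: [
    { key: { content: "Invoice Date" }, value: { content: "2024-01-15" }, confidence: 0.98 },
    { key: { content: "PO Number" }, confidence: 0.61 }, // no value detected
  ],
};

console.log(keyValueRows(sampleKeyValueResult));
```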

2. Paragraphs - Text blocks

result.paragraphs.forEach((paragraph) => {
  // paragraph.content: "This is a paragraph..."
  // paragraph.role: e.g. "title" or "sectionHeading" (optional; paragraphs carry no confidence score)
});

3. Tables - Structured data

result.tables.forEach((table) => {
  table.cells.forEach((cell) => {
    // cell.content: Cell text
    // cell.rowIndex: 0
    // cell.columnIndex: 1
  });
});
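Because cells arrive as a flat list, rendering a table usually means rebuilding the grid first. The sketch below does that with each cell's `rowIndex` and `columnIndex`. It assumes the table object also exposes `rowCount` and `columnCount`; `tableToGrid` is a hypothetical helper, and merged cells are simply placed at their top-left position.

```javascript
// Hypothetical helper: rebuild a 2D array of strings from a flat cell list.
// Assumes table.rowCount and table.columnCount are present.
function tableToGrid(table) {
  // Start with an empty grid so positions without a cell stay "".
  const grid = Array.from({ length: table.rowCount }, () =>
    Array(table.columnCount).fill("")
  );
  for (const cell of table.cells) {
    grid[cell.rowIndex][cell.columnIndex] = cell.content;
  }
  return grid;
}

// Invented sample data for illustration.
const sampleTable = {
  rowCount: 2,
  columnCount: 2,
  cells: [
    { rowIndex: 0, columnIndex: 0, content: "Item" },
    { rowIndex: 0, columnIndex: 1, content: "Price" },
    { rowIndex: 1, columnIndex: 0, content: "Widget" },
    { rowIndex: 1, columnIndex: 1, content: "9.99" },
  ],
};

console.log(tableToGrid(sampleTable));
// [ [ 'Item', 'Price' ], [ 'Widget', '9.99' ] ]
```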

4. Selection Marks - Checkboxes

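// Selection marks are reported per page, not at the top level of the result
result.pages.forEach((page) => {
  (page.selectionMarks ?? []).forEach((mark) => {
    // mark.state: "selected" or "unselected"
    // mark.confidence: 0.95
  });
});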
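A quick way to summarize checkboxes is to count how many are ticked. This sketch assumes selection marks are listed per page under `page.selectionMarks`; adjust the path if your result is shaped differently. `countSelectionMarks` is a hypothetical helper and the sample data is invented.

```javascript
// Hypothetical helper: count checkbox states across all pages.
// Assumption: marks are listed per page under page.selectionMarks.
function countSelectionMarks(result) {
  const counts = { selected: 0, unselected: 0 };
  for (const page of result.pages ?? []) {
    for (const mark of page.selectionMarks ?? []) {
      if (mark.state === "selected") {
        counts.selected += 1;
      } else if (mark.state === "unselected") {
        counts.unselected += 1;
      }
    }
  }
  return counts;
}

// Invented sample: two pages, the second one without any marks.
const sampleMarksResult = {
  pages: [
    {
      selectionMarks: [
        { state: "selected", confidence: 0.95 },
        { state: "unselected", confidence: 0.9 },
        { state: "selected", confidence: 0.88 },
      ],
    },
    {},
  ],
};

console.log(countSelectionMarks(sampleMarksResult)); // { selected: 2, unselected: 1 }
```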

6.4 How Azure Form Recognizer Works

Behind the scenes, here's what happens:

  1. Upload: Your document is sent to Azure (securely via HTTPS)
  2. OCR: Optical Character Recognition extracts all text
  3. Layout Analysis: AI understands document structure
  4. Entity Extraction: Identifies tables, key-value pairs, etc.
  5. Confidence Scoring: AI provides certainty levels
  6. Results: Structured JSON data is returned to your app
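The last step hands your app a single result object. A quick way to see what came back is to count its top-level collections, as sketched below. `summarizeResult` is a hypothetical helper and the sample object is invented; real results carry many more fields.

```javascript
// Hypothetical helper: count the main collections in an analysis result.
// Collections that are missing from the result count as 0.
function summarizeResult(result) {
  const size = (list) => (Array.isArray(list) ? list.length : 0);
  return {
    pages: size(result.pages),
    paragraphs: size(result.paragraphs),
    tables: size(result.tables),
    keyValuePairs: size(result.keyValuePairs),
  };
}

// Invented sample: one page, two paragraphs, one table, no key-value pairs.
const sampleSummaryResult = {
  pages: [{}],
  paragraphs: [{ content: "Invoice" }, { content: "Thank you for your business." }],
  tables: [{ rowCount: 2, columnCount: 2, cells: [] }],
};

console.log(summarizeResult(sampleSummaryResult));
// { pages: 1, paragraphs: 2, tables: 1, keyValuePairs: 0 }
```

Logging this summary right after `analyzeDocument` returns can make the "No results or empty tables" case easier to diagnose.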

6.5 Extending the Application

Want to add more features? Here are some ideas:

🎨 Visual Enhancements:

  • Show bounding boxes on the original document
  • Highlight extracted regions
  • Add a document preview

📊 Data Export:

  • Export results to CSV
  • Generate a JSON download
  • Create a PDF report
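As a starting point for the CSV idea, the sketch below turns an array of row objects into CSV text, quoting fields that contain commas, quotes, or line breaks. `csvField` and `toCsv` are hypothetical helpers, and the row shape (key, value, confidence) is an assumption made for this example.

```javascript
// Quote a field if it contains a comma, a double quote, or a line break;
// embedded double quotes are doubled, as is conventional for CSV.
function csvField(value) {
  const text = String(value ?? "");
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Build CSV text with a header row from the given column names.
function toCsv(rows, columns) {
  const header = columns.map(csvField).join(",");
  const lines = rows.map((row) =>
    columns.map((column) => csvField(row[column])).join(",")
  );
  return [header, ...lines].join("\r\n");
}

// Invented sample rows for illustration.
const csv = toCsv(
  [
    { key: "Invoice Number", value: "INV-12345", confidence: 0.98 },
    { key: "Total", value: "1,250.00", confidence: 0.91 },
  ],
  ["key", "value", "confidence"]
);

console.log(csv);
// key,value,confidence
// Invoice Number,INV-12345,0.98
// Total,"1,250.00",0.91
```

In the browser, the text could be offered as a download by wrapping it in a Blob with type "text/csv" and pointing a link at an object URL created from it.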

🔍 Advanced Features:

  • Use custom trained models for specific document types
  • Add language detection
  • Implement batch processing for multiple files

🛡️ Security Improvements:

  • Move credentials to environment variables
  • Add file size validation
  • Implement rate limiting

🎉 Step 7: Completion and Next Steps

Congratulations! 🎊 You've successfully completed the Azure Document Analyzer Lab!

What You've Accomplished

✅ Created an Azure Form Recognizer resource
✅ Configured Azure credentials in your application
✅ Built a document analysis web application
✅ Tested document upload and analysis
✅ Extracted key-value pairs, tables, and text
✅ Understood how Azure AI services integrate with JavaScript

Verification Checklist

  • Your application runs without errors
  • You can upload and analyze documents
  • Results display correctly with confidence scores
  • You understand the code structure
  • You can explain how the Azure SDK works

Clean Up (Optional)

If you're done experimenting and want to avoid charges:

  1. Go to Azure Portal
  2. Navigate to your Resource Group: document-analyzer-rg
  3. Click "Delete resource group"
  4. Type the resource group name to confirm
  5. Click "Delete"

Note: The Free tier (F0) doesn't incur charges, so you can keep it running!

Next Steps - Keep Learning!

📚 Learn More About Azure AI:

🚀 Enhance Your Project:

  • Add support for invoices and receipts with specific models
  • Create a backend API to keep credentials secure
  • Deploy to Azure Static Web Apps
  • Add user authentication

🤝 Share Your Experience:

  • Star this repository on GitHub ⭐
  • Share what you built on social media
  • Help others by contributing to this lab

Additional Resources


🆘 Troubleshooting Guide

Common Issues and Solutions

Issue: Build fails with "Module not found"

Solution: Delete node_modules and reinstall
rm -rf node_modules package-lock.json
npm install

Issue: "Access denied" or "401 Unauthorized"

Solution: 
1. Verify your API key is correct (no extra spaces)
2. Check that the endpoint URL is complete
3. Ensure your Azure resource is deployed and active

Issue: "Quota exceeded"

Solution: 
- Free tier: 500 pages/month limit
- Wait until next month or upgrade to Standard tier
- Check usage in Azure Portal under "Metrics"

Issue: Analysis takes very long or times out

Solution:
- Large PDFs (>10 MB) take longer
- Try a smaller document first
- Check your internet connection
- Verify Azure service status

Issue: No results or empty tables

Solution:
- Check the document quality (not too blurry)
- Ensure text is machine-printed (not handwritten) for best results
- Try a different document (invoice, receipt, form)

Getting Help

Need assistance? Here's how to get help:

  1. Check the Console: Open browser DevTools (F12) → Console tab
  2. Review Error Messages: Copy the full error for better debugging
  3. Azure Status: Check Azure Status Page
  4. Create an Issue: GitHub Issues

🔒 Security Best Practices

For Production Applications:

  1. Never commit credentials

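    // ❌ DON'T: Hard-code credentials
    // const key = "actual-key-value-here";

    // ✅ DO: Read them from environment variables (in server-side Node.js code,
    // where process.env exists; values bundled for the browser are still public)
    const key = process.env.AZURE_KEY;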
  2. Use Azure Key Vault for storing secrets

  3. Implement a backend to proxy Azure calls

    • Frontend calls your backend
    • Backend holds credentials securely
    • Backend calls Azure APIs
  4. Add authentication to your app

  5. Validate file uploads

    // Check file size
    if (file.size > 50 * 1024 * 1024) {
      alert("File too large!");
      return;
    }
    
    // Check file type (TIFF included to match the supported formats in Step 5.2)
    const allowedTypes = ['application/pdf', 'image/jpeg', 'image/png', 'image/tiff'];
    if (!allowedTypes.includes(file.type)) {
      alert("Invalid file type!");
      return;
    }

📄 License

This project is licensed under the MIT License.


🙏 Acknowledgments

  • Built with Azure Form Recognizer
  • Powered by Azure AI services
  • Created as an educational lab for learning cloud AI integration

Happy Learning! 🎓

If you found this lab helpful, please ⭐ star the repository and share it with others!
