Asutosh11/DocumentReader

DocumentReader

This library reads Word documents (.doc and .docx), plain-text (.txt) files, and PDF files, and returns the content of the document as a String.

If you have ever tried to read the contents of a PDF or MS Word document on Android, you know how painful it is. This library reduces that to a single call.


Repository for build.gradle (Project level)

repositories {
  ...
  maven { url 'https://jitpack.io' }
}


Dependency for build.gradle (Module: app)

dependencies {
  ...
  implementation 'com.github.Asutosh11:DocumentReader:0.12'

  // NOTE: use this only if you get a multidex exception
  implementation 'androidx.multidex:multidex:2.0.1'
}

android {
  // NOTE: use this only if you get a multidex exception
  defaultConfig {
    ...
    multiDexEnabled true
  }

  // NOTE: use this only if you get an error like - More than one file was found with OS independent path
  packagingOptions {
    exclude 'META-INF/DEPENDENCIES'
    exclude 'META-INF/INDEX.LIST'
    exclude 'META-INF/spring.handlers'
    exclude 'META-INF/spring.schemas'
    exclude 'META-INF/cxf/bus-extensions.txt'
  }
}


How to use it?

// Read a PDF file from a Uri
val docString: String = DocumentReaderUtil.readPdfFromUri(fileUri, applicationContext)
// Read a PDF file from a File
val docString: String = DocumentReaderUtil.readPdfFromFile(file, applicationContext)
// Read a .doc file from a Uri
val docString: String = DocumentReaderUtil.readWordDocFromUri(fileUri, applicationContext)
// Read a .doc file from a File
val docString: String = DocumentReaderUtil.readWordDocFromFile(file, applicationContext)
// Read a .docx file from a Uri (same call as for .doc)
val docString: String = DocumentReaderUtil.readWordDocFromUri(fileUri, applicationContext)
// Read a .docx file from a File (same call as for .doc)
val docString: String = DocumentReaderUtil.readWordDocFromFile(file, applicationContext)
// Read a .txt file from a Uri
val docString: String = DocumentReaderUtil.readTxtFromUri(fileUri, applicationContext)
/*
 If you don't know the file type in advance, let the library detect the
 MIME type and dispatch to the matching reader. Unsupported types fall
 through to an empty String.
*/
val docString: String = when (DocumentReaderUtil.getMimeType(fileUri, applicationContext)) {
    "text/plain" -> DocumentReaderUtil.readTxtFromUri(fileUri, applicationContext)
    "application/pdf" -> DocumentReaderUtil.readPdfFromUri(fileUri, applicationContext)
    "application/msword" -> DocumentReaderUtil.readWordDocFromUri(fileUri, applicationContext)
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document" ->
        DocumentReaderUtil.readWordDocFromUri(fileUri, applicationContext)
    else -> ""
}
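
Returning "" for an unknown MIME type makes an unsupported file look the same as an empty document. One way around that, sketched below, is to classify the MIME type first and treat null as "unsupported". DocKind and docKindFor are hypothetical names for illustration and are not part of DocumentReader. The sketch also assumes the detected MIME type may be null and may differ in letter case, which the README does not state.

```kotlin
// Hypothetical helper, not part of the DocumentReader library.
// It only classifies a MIME type string; it does not read any file.
enum class DocKind { TXT, PDF, WORD }

// Returns the kind of reader a MIME type needs, or null if unsupported.
// Assumption: the input may be null or use mixed case.
fun docKindFor(mimeType: String?): DocKind? = when (mimeType?.lowercase()) {
    "text/plain" -> DocKind.TXT
    "application/pdf" -> DocKind.PDF
    // .doc and .docx share one reader call in the usage shown above
    "application/msword",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document" -> DocKind.WORD
    else -> null
}

fun main() {
    println(docKindFor("application/pdf")) // prints "PDF"
    println(docKindFor("image/png"))       // prints "null"
}
```

In an app, the result could drive the same calls shown above: DocKind.TXT to readTxtFromUri, DocKind.PDF to readPdfFromUri, DocKind.WORD to readWordDocFromUri, with null handled as an explicit "unsupported file type" case instead of an empty String.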

Thanks

The Apache Tika project
The Android port of Apache PDFBox by TomRoush
