This repository has been archived by the owner on Oct 31, 2019. It is now read-only.

Read .gz file from Data Lake #115

Open
MartheUT opened this issue May 8, 2018 · 1 comment

Comments


MartheUT commented May 8, 2018

We need to read .gz files from the Data Lake. Simply adding gunzip to the azureDataLakeRead function will not work, because gunzip operates on a file, not on an HTTP response.


MartheUT commented May 8, 2018

Probably not the most elegant solution, but it works:

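azureDataLakeReadCSVGZ <- function(azureActiveContext, azureDataLakeAccount, relativePath,
                                   offset, length, bufferSize, seperator = ",",
                                   verbose = FALSE)
{
  # seperator is now an argument of this function; it was previously undefined here.
  # Assumes azureDataLakeReadCore takes the same arguments as azureDataLakeRead.
  resHttp <- azureDataLakeReadCore(azureActiveContext, azureDataLakeAccount,
                                   relativePath, offset, length, bufferSize, verbose)
  stopWithAzureError(resHttp)
  # Keep the body as raw bytes so the gzip stream is not altered
  resRaw <- content(resHttp, as = "raw")

  # Write a temporary file in binary mode from where the data can be unzipped
  TempName <- tempfile(pattern = "", fileext = ".csv.gz")
  on.exit(unlink(TempName), add = TRUE)  # remove the temporary file afterwards
  con <- file(TempName, "wb")
  writeBin(resRaw, con)
  close(con)
  # read.table decompresses gzip files transparently
  Data <- read.table(TempName, sep = seperator)
  return(Data)
}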
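
A possible alternative that avoids the temporary file is to decompress the raw bytes in memory with base R connections. This is only a sketch: `read_gz_raw` is a hypothetical helper name, not part of AzureSMR, and it has not been tried against an actual Data Lake response. It assumes the body really is a gzip stream holding delimited text.

```r
# Hypothetical helper (not part of AzureSMR): parse gzip-compressed
# delimited text that is held in a raw vector, without a temporary file.
read_gz_raw <- function(raw_bytes, sep = ",", ...) {
  # Wrap the raw vector in a connection and let gzcon() decompress on read
  con <- gzcon(rawConnection(raw_bytes, open = "rb"))
  on.exit(close(con))
  # Pull the decompressed text into memory, one element per line
  lines <- readLines(con)
  # Parse the lines as a table; extra arguments go to read.table
  read.table(text = lines, sep = sep, ...)
}
```

Inside a function like the one above, the call would presumably be `read_gz_raw(content(resHttp, as = "raw"), sep = seperator)`. Note that the whole decompressed file is held in memory, so for very large files the temporary-file approach may still be preferable.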
