ReadMtx failed on Windows #5687

Closed
hongduosun opened this issue Mar 3, 2022 · 13 comments
Labels
bug Something isn't working

Comments

@hongduosun

The error is similar to #5362.

Calling the ReadMtx function to load local files fails on Windows with the following message:
Error in url(description = uri) : URL scheme unsupported by this method

It fails whether I specify absolute paths or change the working directory, but it works fine on Ubuntu 18.04.

sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale:
[1] LC_COLLATE=Chinese (Simplified)_China.936
[2] LC_CTYPE=Chinese (Simplified)_China.936
[3] LC_MONETARY=Chinese (Simplified)_China.936
[4] LC_NUMERIC=C
[5] LC_TIME=Chinese (Simplified)_China.936

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] httr_1.4.2 SeuratObject_4.0.4 Seurat_4.1.0

loaded via a namespace (and not attached):
[1] nlme_3.1-155 matrixStats_0.61.0 spatstat.sparse_2.1-0
[4] RcppAnnoy_0.0.19 RColorBrewer_1.1-2 sctransform_0.3.3
[7] tools_4.1.2 utf8_1.2.2 R6_2.5.1
[10] irlba_2.3.5 rpart_4.1.16 KernSmooth_2.23-20
[13] uwot_0.1.11 mgcv_1.8-39 lazyeval_0.2.2
[16] colorspace_2.0-3 tidyselect_1.1.2 gridExtra_2.3
[19] curl_4.3.2 compiler_4.1.2 cli_3.2.0
[22] plotly_4.10.0 scales_1.1.1 lmtest_0.9-39
[25] spatstat.data_2.1-2 ggridges_0.5.3 pbapply_1.5-0
[28] goftest_1.2-3 stringr_1.4.0 digest_0.6.29
[31] spatstat.utils_2.3-0 pkgconfig_2.0.3 htmltools_0.5.2
[34] parallelly_1.30.0 fastmap_1.1.0 htmlwidgets_1.5.4
[37] rlang_1.0.1 shiny_1.7.1 generics_0.1.2
[40] zoo_1.8-9 jsonlite_1.8.0 ica_1.0-2
[43] spatstat.random_2.1-0 dplyr_1.0.8 magrittr_2.0.2
[46] patchwork_1.1.1 Matrix_1.4-0 Rcpp_1.0.8
[49] munsell_0.5.0 fansi_1.0.2 abind_1.4-5
[52] reticulate_1.24 lifecycle_1.0.1 stringi_1.7.6
[55] MASS_7.3-55 Rtsne_0.15 plyr_1.8.6
[58] grid_4.1.2 parallel_4.1.2 listenv_0.8.0
[61] promises_1.2.0.1 ggrepel_0.9.1 crayon_1.5.0
[64] miniUI_0.1.1.1 deldir_1.0-6 lattice_0.20-45
[67] cowplot_1.1.1 splines_4.1.2 tensor_1.5
[70] pillar_1.7.0 igraph_1.2.11 spatstat.geom_2.3-2
[73] future.apply_1.8.1 reshape2_1.4.4 codetools_0.2-18
[76] leiden_0.3.9 glue_1.6.2 data.table_1.14.2
[79] png_0.1-7 vctrs_0.3.8 httpuv_1.6.5
[82] gtable_0.3.0 RANN_2.6.1 purrr_0.3.4
[85] spatstat.core_2.4-0 polyclip_1.10-0 tidyr_1.2.0
[88] scattermore_0.8 future_1.24.0 ggplot2_3.3.5
[91] mime_0.12 xtable_1.8-4 later_1.3.0
[94] survival_3.3-0 viridisLite_0.4.0 tibble_3.1.6
[97] cluster_2.1.2 globals_0.14.0 fitdistrplus_1.1-6
[100] ellipsis_0.3.2 ROCR_1.0-11

hongduosun added the bug label Mar 3, 2022

@hongduosun
Author

Hi @saketkc , could you please help to check this issue?

@saketkc
Collaborator

saketkc commented Mar 3, 2022

Thanks for posting the session info. Can you also post your relevant code here?

@hongduosun
Author

Yes, sure.

library(Seurat)
data <- ReadMtx("G:\\demo_UMI_counts.mtx.gz", "G:\\demo_cells.tsv.gz", "G:\\demo_features.tsv.gz")

or:

setwd("G:\\")
library(Seurat)
data <- ReadMtx("demo_UMI_counts.mtx.gz", "demo_cells.tsv.gz", "demo_features.tsv.gz")

I have also tested on macOS 15.2.1, where it works well. So I guess this problem is Windows-specific.

@saketkc
Collaborator

saketkc commented Mar 3, 2022

Thanks! I will need to find a Windows machine to debug.

@daduncan0302

I would like to mention that I am having a similar problem with ReadMtx on a Windows build, using the following dataset:

https://cf.10xgenomics.com/samples/cell-exp/6.1.0/20k_PBMC_3p_HT_nextgem_Chromium_X/20k_PBMC_3p_HT_nextgem_Chromium_X_raw_feature_bc_matrix.tar.gz

After putting the data extracted from the archive into the working directory, the command:

mtx_obj <- ReadMtx(mtx = "matrix.mtx.gz", features = "features.tsv.gz", cells = "barcodes.tsv.gz")

also raised the error: Error in url(description = uri) : URL scheme unsupported by this method

@JuliaGnatek

JuliaGnatek commented Mar 8, 2022

@haydensun @saketkc Windows user here. I did some basic debugging, and the problem seems to lie in the first if statement of the ReadMtx function. uri is derived with normalizePath, which on Windows returns a path with backslashes.

The first if statement searches for forward slashes only, so on Windows the program goes straight to the else branch and returns the error. Accounting for backslashes let me run this function locally without any issues.

I only changed the condition in the first if statement, from
(grepl(pattern = '^:///', x = uri)) to
(grepl(pattern = '^:///', x = uri) | grepl(pattern = ':\\\\', x = uri))
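
The difference between the two conditions can be checked on plain strings in base R. This is only a rough sketch: the two sample paths are made up for illustration, the helper names orig_check and new_check are hypothetical, and the patterns are applied directly to the path strings rather than inside ReadMtx.

```r
# Hypothetical sample paths: one in the backslash form that normalizePath()
# returns on Windows, one POSIX-style path for comparison.
win_path  <- "G:\\demo_cells.tsv.gz"
unix_path <- "/home/user/demo_cells.tsv.gz"

# Original condition: looks for ":///" at the very start of the string.
orig_check <- function(uri) {
  grepl(pattern = "^:///", x = uri)
}

# Amended condition quoted above: additionally accept a colon that is
# immediately followed by a backslash (e.g. a drive letter such as "G:\").
new_check <- function(uri) {
  grepl(pattern = "^:///", x = uri) | grepl(pattern = ":\\\\", x = uri)
}

orig_check(win_path)   # the original pattern does not match the Windows path
new_check(win_path)    # the amended pattern does
new_check(unix_path)   # a plain POSIX path matches neither pattern
```

Note that the R string ":\\\\" is the regular expression ":\\", that is, a colon followed by one literal backslash.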

@saketkc
Collaborator

saketkc commented Mar 8, 2022

Thanks @JuliaGnatek, that's very helpful and makes sense, given that the failures happen even if the files are in the current working directory. I will fix it and update it here.

@mircomacchi

Hello, I had the same issue on Windows. I managed to resolve it and finally load the matrix with ReadMtx by installing R version 4.0.5 (2021-03-31). I suppose this is a problem with R versions >= 4.1.
Hope this helps,
Mirco

@saketkc
Collaborator

saketkc commented Mar 13, 2022

Sorry about the delay. Can you confirm if the following function fixes it:

library(Matrix)
library(utils)
library(httr)
library(tools)

ReadMtx <- function(
  mtx,
  cells,
  features,
  cell.column = 1,
  feature.column = 2,
  cell.sep = "\t",
  feature.sep = "\t",
  skip.cell = 0,
  skip.feature = 0,
  mtx.transpose = FALSE,
  unique.features = TRUE,
  strip.suffix = FALSE
) {
  all.files <- list(
    "expression matrix" = mtx,
    "barcode list" = cells,
    "feature list" = features
  )
  for (i in seq_along(along.with = all.files)) {
    uri <- normalizePath(all.files[[i]], mustWork = FALSE)
    err <- paste("Cannot find", names(x = all.files)[i], "at", uri)
    uri <- build_url(url = parse_url(url = uri))
    if (grepl(pattern = '^:///', x = uri) | grepl(pattern = ':\\\\', x = uri)) {
      uri <- gsub(pattern = '^://', replacement = '', x = uri)
      uri <- gsub(pattern = '^:\\\\', replacement = '', x = uri)
      if (!file.exists(uri)) {
        stop(err, call. = FALSE)
      }
    } else {
      if (!Online(url = uri, seconds = 2L)) {
        stop(err, call. = FALSE)
      }
      if (file_ext(uri) == 'gz') {
        con <- url(description = uri)
        uri <- gzcon(con = con, text = TRUE)
      }
    }
    all.files[[i]] <- uri
  }
  cell.barcodes <- read.table(
    file = all.files[['barcode list']],
    header = FALSE,
    sep = cell.sep,
    row.names = NULL,
    skip = skip.cell
  )
  feature.names <- read.table(
    file = all.files[['feature list']],
    header = FALSE,
    sep = feature.sep,
    row.names = NULL,
    skip = skip.feature
  )
  # read barcodes
  bcols <- ncol(x = cell.barcodes)
  if (bcols < cell.column) {
    stop(
      "cell.column was set to ",
      cell.column,
      " but ",
      cells,
      " only has ",
      bcols,
      " columns.",
      " Try setting the cell.column argument to a value <= to ",
      bcols,
      "."
    )
  }
  cell.names <- cell.barcodes[, cell.column]
  if (all(grepl(pattern = "\\-1$", x = cell.names)) & strip.suffix) {
    cell.names <- as.vector(x = as.character(x = sapply(
      X = cell.names,
      FUN = ExtractField,
      field = 1,
      delim = "-"
    )))
  }
  # read features
  fcols <- ncol(x = feature.names)
  if (fcols < feature.column) {
    stop(
      "feature.column was set to ",
      feature.column,
      " but ",
      features,
      " only has ",
      fcols, " column(s).",
      " Try setting the feature.column argument to a value <= to ",
      fcols,
      "."
    )
  }
  if (any(is.na(x = feature.names[, feature.column]))) {
    na.features <- which(x = is.na(x = feature.names[, feature.column]))
    replacement.column <- ifelse(test = feature.column == 2, yes = 1, no = 2)
    if (replacement.column > fcols) {
      stop(
        "Some features names are NA in column ",
        feature.column,
        ". Try specifiying a different column.",
        call. = FALSE
        )
    } else {
      warning(
        "Some features names are NA in column ",
        feature.column,
        ". Replacing NA names with ID from column ",
        replacement.column,
        ".",
        call. = FALSE
        )
    }
    feature.names[na.features, feature.column] <- feature.names[na.features, replacement.column]
  }
  feature.names <- feature.names[, feature.column]
  if (unique.features) {
    feature.names <- make.unique(names = feature.names)
  }
  data <- readMM(file = all.files[['expression matrix']])
  if (mtx.transpose) {
    data <- t(x = data)
  }
  if (length(x = cell.names) != ncol(x = data)) {
    stop(
      "Matrix has ",
      ncol(data),
      " columns but found ", length(cell.names),
      " barcodes. ",
      ifelse(
        test = length(x = cell.names) > ncol(x = data),
        yes = "Try increasing `skip.cell`. ",
        no = ""
      ),
      call. = FALSE
      )
  }
  if (length(x = feature.names) != nrow(x = data)) {
    stop(
      "Matrix has ",
      nrow(data),
      " rows but found ", length(feature.names),
      " features. ",
      ifelse(
        test = length(x = feature.names) > nrow(x = data),
        yes = "Try increasing `skip.feature`. ",
        no = ""
      ),
      call. = FALSE
      )
  }

  colnames(x = data) <- cell.names
  rownames(x = data) <- feature.names
  data <- as(data, Class = "dgCMatrix")
  return(data)
}

@hongduosun
Author

hongduosun commented Mar 15, 2022

Hi @saketkc, I'm afraid the updated code above still does not work, because the normalizePath function converts a relative path to an absolute path on Windows. Please see the detailed debugging info below.

> setwd("G:\\")
> getwd()
[1] "G:/"

Since normalizePath gives the same absolute path whether a relative or an absolute path is given on Windows, the subsequent build_url call also gives the same result:

> uri <- normalizePath("test.txt",mustWork = FALSE)
> uri
[1] "G:\\test.txt"
> uri <- build_url(url = parse_url(url = uri))
> uri
[1] "G:///\\test.txt"
> grepl(pattern = '^:///', x = uri) | grepl(pattern = ':\\\\', x = uri)
[1] FALSE

> uri <- normalizePath("G:\\test.txt",mustWork = FALSE)
> uri
[1] "G:\\test.txt"
> uri <- build_url(url = parse_url(url = uri))
> uri
[1] "G:///\\test.txt"
> grepl(pattern = '^:///', x = uri) | grepl(pattern = ':\\\\', x = uri)
[1] FALSE

So I think the correct action for local files is to check whether ':///' exists and replace it with ':', and the code should be changed to:

if (grepl(pattern = '^:///', x = uri) | grepl(pattern = ':///', x = uri)) {
      uri <- gsub(pattern = '^://', replacement = '', x = uri)
      uri <- gsub(pattern = ':///', replacement = ':', x = uri)
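
As a quick check of this replacement, here is a minimal base R sketch applied to the string that build_url returned in the session above. It is not the code from the eventual fix; fixed = TRUE is added here as an assumption so that ':///' is matched literally.

```r
# The URI that build_url(parse_url(...)) produced in the session above.
uri <- "G:///\\test.txt"

# Treat the URI as a local file when it contains ":///" anywhere.
if (grepl(pattern = ":///", x = uri, fixed = TRUE)) {
  # Strip a leading "://" (the case the original code already handled) ...
  uri <- gsub(pattern = "^://", replacement = "", x = uri)
  # ... and collapse ":///" after a drive letter back to a single ":".
  uri <- gsub(pattern = ":///", replacement = ":", x = uri, fixed = TRUE)
}

uri  # the Windows path again, ready to be passed to file.exists()
```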

It's also worth noting that normalizePath will not give the expected path for a remote URL, as it inserts the current working directory before the URL on Windows:

> normalizePath("https://github.com/satijalab/seurat/issues/5687", mustWork=FALSE)
[1] "G:\\https:\\github.com\\satijalab\\seurat\\issues\\5687"

@JuliaGnatek

I definitely have to agree with @haydensun. It seems that I accidentally removed the line where the build_url function was used, and later suggested a solution for the function without that line. I'm sorry for the inconvenience!

I can only suggest simplifying

if (grepl(pattern = '^:///', x = uri) | grepl(pattern = ':///', x = uri))

to

if (grepl(pattern = ':///', x = uri)) 

@saketkc
Collaborator

saketkc commented Apr 1, 2022

This should now be fixed by 3e437fe. You can install the develop version using the instructions here

saketkc closed this as completed Apr 1, 2022

@uqjlu8

uqjlu8 commented Apr 13, 2022

I have the same problem when trying to use ReadMtx in RStudio on Windows. I keep getting the following error: "Error: Cannot find expression matrix at /path/to/matrix". I tried installing the development version of Seurat (4.1.0.9003), but I still get the same error.
