Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alaska? #340

Closed
cswingle opened this issue Apr 17, 2023 · 6 comments
Closed

Alaska? #340

cswingle opened this issue Apr 17, 2023 · 6 comments

Comments

@cswingle
Copy link

I just attempted to run the first few lines of the example from the U.S. Data and Package Basics vignette, but replacing the location with the intersection of the Chena and Tanana Rivers in Interior Alaska:

library(sf)
library(nhdplusTools) # 

start_point <- st_sfc(st_point(c(-147.911472, 64.796633)), crs = 4269)
start_comid <- discover_nhdplus_id(start_point)

and I get an error:

Error in: https://labs.waterdata.usgs.gov/api/nldi/linked-data/comid/position?f=json&coords=POINT%28-147.911472%2064.796633%29
Error in: https://labs.waterdata.usgs.gov/api/nldi/linked-data/comid/NULL/

The code works with c(89.362, 43.090), so I'm guessing this is a case where the dataset/API/R library isn't applicable to Alaska.

Suggestions? Thanks!

> sessionInfo()
R version 4.2.2 (2022-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 11 (bullseye)

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.13.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_DK.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8      
 [8] LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] nhdplusTools_0.6.2 sf_1.0-12          nvimcom_0.9-144   

loaded via a namespace (and not attached):
 [1] zip_2.3.0            Rcpp_1.0.10          pillar_1.9.0         compiler_4.2.2       class_7.3-20         tools_4.2.2          jsonlite_1.8.4       lifecycle_1.0.3      tibble_3.2.1        
[10] fstcore_0.9.14       pkgconfig_2.0.3      rlang_1.1.0          DBI_1.1.3            cli_3.6.1            curl_5.0.0           dataRetrieval_2.7.12 parallel_4.2.2       e1071_1.7-13        
[19] dplyr_1.1.1          httr_1.4.5           xml2_1.3.3           generics_0.1.3       vctrs_0.6.1          classInt_0.4-9       grid_4.2.2           tidyselect_1.2.0     glue_1.6.2          
[28] R6_2.5.1             fansi_1.0.4          pbapply_1.7-0        RANN_2.6.1           tidyr_1.3.0          purrr_1.0.1          magrittr_2.0.3       units_0.8-1          fst_0.9.8           
[37] utf8_1.2.3           KernSmooth_2.23-20   proxy_0.4-27
@dblodgett-usgs
Copy link
Collaborator

That's correct. Sorry for the trouble.

For Alaska applications, you will have to go to NHD data directly. Unfortunately, nhdplusTools support for Alaska (region 19) is pretty thin. I've recently been doing a little more work in region 19 (https://code.usgs.gov/wma/nhgf/mainstems/-/blob/master/reg19_prototype.R) but it's very early.

If you want to dig in, you could work with my new work over in hydroloom https://github.com/DOI-USGS/hydroloom and make some hay, but there are a few things that would require fairly specific knowledge of the NHD and some of the stuff in hydroloom.

What application are you trying to accomplish? Maybe I could use it as an excuse to advance parts of that work?

@cswingle
Copy link
Author

Thanks for the reply! I will take a look at hydroloom and see what I can do. I just looked at the region 19 code; it's nice to see clearly written tidyverse-style R.

We are working on a project that's concerned with vehicle pollution entering the Chena River (and upstream tributaries) so my initial task was getting line layers for the Chena River and all the tributaries that drain into it. After that layer is intersected with a road layer, one important parameter will be the distance the crossing is from the Chena River, so some understanding of the network and direction of flow will matter for calculating the path and distance to the Chena. We have done this sort of work using the NHD directly, but it isn't easy and your package looked like a much simpler workflow that also requires no non-reproducible ArcMap/ArcGIS Pro mouse work.

@dblodgett-usgs
Copy link
Collaborator

dblodgett-usgs commented Apr 19, 2023

Alright -- so a quick doodle testing out some of the stuff I've been working on was productive.

I would like to:

With those caveats, the below shows how to index your point to an NHD flowline and do an upstream / downstream navigation. There's no "upmain" or "downmain" withouth more work -- which is what the prototype I linked above does, but this is probably a good start?

image

UPDATE: 4/19/23 -- this depends on remotes::install_github("doi-usgs/nhdplusTools@hydroloom")

download_nhd() was added.

library(hydroloom)
library(dplyr)
options(timeout = 60000)
options(timeout = 60000)

dir <- nhdplusTools::download_nhd("../../../Data/ak", "1908", TRUE)

fs <- list.files(dir, pattern = ".*1908.*.gdb", full.names = TRUE)

fl <- sf::read_sf(fs, "NHDFlowline")

fl <- filter(fl, ftype != 566)

fl <- select(fl, id = permanent_identifier, gnis_id, gnis_name, reachcode, 
             flowdir, wbid = wbarea_permanent_identifier, ftype, fcode)

# measures will be incorrect but we can fix this later.
fl <- mutate(fl, aggregate_id_from_measure = 0, aggregate_id_to_measure = 100)

point <- sf::st_sfc(sf::st_point(c(-147.911472, 64.796633)), 
                              crs = 4269)

i <- hydroloom::index_points_to_lines(fl, point)

sub <- fl[fl$id == i$id,]

mapview::mapview(list(sub, point))

flow_table <- sf::read_sf(fs, "NHDFlow")
# need to get the flow table into hydroloom ids
flow_table <- select(flow_table, 
                     id = from_permanent_identifier, 
                     toid = to_permanent_identifier)

# LOOPS!!! 
flow_table <- flow_table |>
  filter(id %in% fl$id) |>
  mutate(toid = ifelse(!toid %in% id, "", toid)) |>
  mutate(r = 1:n())

remove <- left_join(flow_table, flow_table, 
               by = c("toid" = "id")) |>
  filter(id == toid.y) |>
  pull(r.x)

flow_table <- filter(flow_table, !r %in% remove) |>
  select(-r)

down <- navigate_network_dfs(flow_table, sub$id, direction = "down")
up <- navigate_network_dfs(flow_table, sub$id, direction = "up")

subdown <- fl[fl$id %in% unique(unlist(down)),]
subup <- fl[fl$id %in% unique(unlist(up)),]

mapview::mapview(list(subup, subdown, point))

```

@cswingle
Copy link
Author

Excellent! Thanks so much! I will try your code and see where it takes me tomorrow morning.

@cswingle
Copy link
Author

FYI, here's what I came up with as a first cut for the analysis I need to do:

library(tidyverse)
library(glue)
library(sf)
library(fs)
library(zip)
# devtools::install_github("DOI-USGS/hydroloom")
library(hydroloom)

options(timeout = 60000)
u <- glue(
  "https://prd-tnm.s3.amazonaws.com/",
  "StagedProducts/Hydrography/NHD/",
  "HU4/GDB/NHD_H_1908_HU4_GDB.zip"
)
d <- "./data/ak"
if (!dir_exists(d)) {
  dir_create(d)
}
f <- path(d, basename(u))

if (!file_exists(f)) {
  download.file(u, f, mode = "wb")

  zip::unzip(f, exdir = d)
}

fs <- dir_ls(d, regexp = ".*1908.*.gdb")

# Chena & Tanana River intersection
point <- sf::st_sfc(
  sf::st_point(c(-147.911472, 64.796633)),
  crs = 4269
)

i <- hydroloom::index_points_to_lines(fl, point)

# Map of NHD data model:
# https://d9-wret.s3.us-west-2.amazonaws.com/assets/palladium/production/s3fs-public/atoms/files/NHD%20v2.3%20Model%20Poster%2006012020.pdf
fc <- sf::read_sf(fs, "NHDFCode") |>
  select(fcode, description) |>
  rename(fcode_description = description)

# Get flowlines from FileGDB
fl <- sf::read_sf(fs, "NHDFlowline") |>
  # Don't include "Coastline"
  filter(ftype != 566) |>
  select(
    permanent_identifier, # 40-char unique identifier
    gnis_id, # GNIS unique identifier
    gnis_name, # Proper name of geographic entity
    reachcode, # first 8 chars = subbasin code (FIPS 103), next 6 random
    flowdir, # direction of flow relative to coordinate order. 1 = WithDigitized, 0 = Unintialized (?!)
    wbarea_permanent_identifier, # pid of waterbody through which this flows
    ftype, # unique identifier of feature type
    fcode # feature type, characteristics, values
  ) |>
  rename(
    id = permanent_identifier,
    wbid = wbarea_permanent_identifier
  ) |>
  # inner_join(i, by = "id") |>
  left_join(fc, by = "fcode") |>
  mutate(
    # required for hydroloom funcs:
    aggregate_id_from_measure = 0,
    aggregate_id_to_measure = 100
  )

st_geometry(fl) <- "wkb_geometry"

sub <- fl %>% inner_join(i, by = "id")

# to / from flow table, filtered to segments in fl
full_flow_table <- sf::read_sf(fs, "NHDFlow") |>
  select(from_permanent_identifier, to_permanent_identifier) |>
  rename(
    id = from_permanent_identifier,
    toid = to_permanent_identifier
  )

filtered_flow_table <- full_flow_table |>
  inner_join(
    fl %>%
      st_drop_geometry() %>%
      select(id),
    by = "id"
  ) |>
  mutate(
    toid = ifelse(!toid %in% id, "", toid),
    r = row_number()
  )

remove <- left_join(
  filtered_flow_table, filtered_flow_table,
  by = c("toid" = "id"),
  relationship = "many-to-many"
) |>
  filter(id == toid.y) |>
  pull(r.x)

flow_table <- filter(filtered_flow_table, !r %in% remove) |>
  select(-r)

# Traverse the flow network for all upstream segments
up <- navigate_network_dfs(flow_table, sub$id, direction = "up")

# Filter flowlines to these segments
subup <- fl[fl$id %in% unique(unlist(up)), ] |>
  inner_join(flow_table, by = "id") |>
  select(
    id, toid,
    gnis_id, gnis_name, reachcode, wbid,
    ftype, fcode, fcode_description,
    wkb_geometry
  )

mapview::mapview(list(subup, point))

# At this point, I think we can use this sf data frame for getting distances
# to the Chena River. First, get the segment that the crossing is on:
crossing_point <- sf::st_sfc(
  sf::st_point(c(-147.2204, 65.0092)),
  crs = 4269
)

crossing_index <- hydroloom::index_points_to_lines(fl, crossing_point)

crossing_segment <- fl %>%
  inner_join(crossing_index, by = "id")

# Now do the same thing as above, but go down
chena_flow_table <- subup |>
  st_drop_geometry() |>
  select(id, toid)

downstream_crossing <- navigate_network_dfs(
  chena_flow_table,
  crossing_segment$id,
  direction = "down"
)

subdown <- subup[subup$id %in% unique(unlist(downstream_crossing)), ] |>
  inner_join(chena_flow_table, by = c("id", "toid")) |>
  select(
    id, toid,
    gnis_id, gnis_name, reachcode, wbid,
    ftype, fcode, fcode_description,
    wkb_geometry
  )

That's about 90% of where I wanted to get. The next steps for my particular analysis are to get the length of subdown to where it intersects the Chena River (instead of to the intersection of the Chena and Tanana Rivers), which seems pretty easy for this segment, but will be more complicated if there is more than one path to the Chena.

Thanks very much!

dblodgett-usgs added a commit to DOI-USGS/hydroloom that referenced this issue Apr 20, 2023
@dblodgett-usgs
Copy link
Collaborator

Very cool. There's plenty of other smarts that could be added if we do a little more network derivative calculation. The prototype workflow I linked previously is actually going to get plowed into the NHD in the long run. So better capabilities are coming.

For the record, I did tick off a few of the things I mentioned doing up above. They are now on the latest release of hydroloom.

https://github.com/DOI-USGS/hydroloom/blob/main/vignettes/flow-table.Rmd is the vignette I got out of the deal. Needs writing and some more description of what's happening but this was a great excuse to work that up. Thanks for the prompt.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants