Old GT3X ending parsed wrong #1

Closed
muschellij2 opened this issue Aug 31, 2020 · 2 comments

@muschellij2
Owner

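There seems to be an overflow issue here, or I am reading data I shouldn't be (for example, past the end of the file). In the reprex below, the same file is read repeatedly: the first few reads end in rows of zeros, but a later read ends in very large values and fails the max(abs(py$data)) < 6.1 check.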

library(pygt3x)
url = "https://github.com/THLfi/read.gt3x/files/3522749/GT3X%2B.01.day.gt3x.zip"
destfile = tempfile(fileext = ".zip")
dl = download.file(url, destfile = destfile)
gt3x_file = unzip(destfile, exdir = tempdir())
gt3x_file = gt3x_file[!grepl("__MACOSX", gt3x_file)]
path = gt3x_file

for (i in 1:10) {
  py = pygt3x::py_read_gt3x(gt3x_file, verbose = 2)
  print(py$data[109910,])
  print(py$data[2699958:2699963,])
  stopifnot(max(abs(py$data)) < 6.1)
}
#> # A tibble: 1 x 3
#>       X     Y      Z
#>   <dbl> <dbl>  <dbl>
#> 1    -6  2.56 -0.047
#> # A tibble: 6 x 3
#>       X     Y     Z
#>   <dbl> <dbl> <dbl>
#> 1 0.469 0.707 0.522
#> 2 0.466 0.704 0.032
#> 3 0     0     0    
#> 4 0     0     0    
#> 5 0     0     0    
#> 6 0     0     0    
#> # A tibble: 1 x 3
#>       X     Y      Z
#>   <dbl> <dbl>  <dbl>
#> 1    -6  2.56 -0.047
#> # A tibble: 6 x 3
#>       X     Y     Z
#>   <dbl> <dbl> <dbl>
#> 1 0.469 0.707 0.522
#> 2 0.466 0.704 0.032
#> 3 0     0     0    
#> 4 0     0     0    
#> 5 0     0     0    
#> 6 0     0     0    
#> # A tibble: 1 x 3
#>       X     Y      Z
#>   <dbl> <dbl>  <dbl>
#> 1    -6  2.56 -0.047
#> # A tibble: 6 x 3
#>       X     Y     Z
#>   <dbl> <dbl> <dbl>
#> 1 0.469 0.707 0.522
#> 2 0.466 0.704 0.032
#> 3 0     0     0    
#> 4 0     0     0    
#> 5 0     0     0    
#> 6 0     0     0    
#> # A tibble: 1 x 3
#>       X     Y      Z
#>   <dbl> <dbl>  <dbl>
#> 1    -6  2.56 -0.047
#> # A tibble: 6 x 3
#>       X     Y     Z
#>   <dbl> <dbl> <dbl>
#> 1 0.469 0.707 0.522
#> 2 0.466 0.704 0.032
#> 3 0     0     0    
#> 4 0     0     0    
#> 5 0     0     0    
#> 6 0     0     0
#> Warning in pygt3x::py_read_gt3x(gt3x_file, verbose = 2): Really large values
#> of X/Y/Z- rerun and see if still there also open issue on https://github.com/
#> muschellij2/pygt3x/issues
#> # A tibble: 1 x 3
#>       X     Y      Z
#>   <dbl> <dbl>  <dbl>
#> 1    -6  2.56 -0.047
#> # A tibble: 6 x 3
#>        X      Y      Z
#>    <dbl>  <dbl>  <dbl>
#> 1  0.469  0.707  0.522
#> 2  0.466  0.704  0.032
#> 3 -3.77  93.6   49.4  
#> 4 93.6   38.4   -3.77 
#> 5 64.1   49.4   93.6  
#> 6 49.4   -3.77  89.7
#> Error: max(abs(py$data)) < 6.1 is not TRUE

Created on 2020-08-31 by the reprex package (v0.3.0)

Session info
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.0.2 (2020-06-22)
#>  os       macOS Mojave 10.14.6        
#>  system   x86_64, darwin17.0          
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       America/New_York            
#>  date     2020-08-31                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version     date       lib source                           
#>  assertthat    0.2.1       2019-03-21 [1] CRAN (R 4.0.0)                   
#>  backports     1.1.9       2020-08-24 [1] CRAN (R 4.0.0)                   
#>  callr         3.4.3       2020-03-28 [1] CRAN (R 4.0.0)                   
#>  cli           2.0.2       2020-02-28 [1] CRAN (R 4.0.0)                   
#>  crayon        1.3.4       2017-09-16 [1] CRAN (R 4.0.0)                   
#>  desc          1.2.0       2020-06-01 [1] Github (muschellij2/desc@b0c374f)
#>  devtools      2.3.1.9000  2020-08-25 [1] Github (r-lib/devtools@df619ce)  
#>  digest        0.6.25      2020-02-23 [1] CRAN (R 4.0.0)                   
#>  ellipsis      0.3.1       2020-05-15 [1] CRAN (R 4.0.0)                   
#>  evaluate      0.14        2019-05-28 [1] CRAN (R 4.0.0)                   
#>  fansi         0.4.1       2020-01-08 [1] CRAN (R 4.0.0)                   
#>  fs            1.5.0       2020-07-31 [1] CRAN (R 4.0.2)                   
#>  glue          1.4.1       2020-05-13 [1] CRAN (R 4.0.0)                   
#>  highr         0.8         2019-03-20 [1] CRAN (R 4.0.0)                   
#>  htmltools     0.5.0       2020-06-16 [1] CRAN (R 4.0.0)                   
#>  jsonlite      1.7.0       2020-06-25 [1] CRAN (R 4.0.0)                   
#>  knitr         1.29        2020-06-23 [1] CRAN (R 4.0.2)                   
#>  lattice       0.20-41     2020-04-02 [1] CRAN (R 4.0.2)                   
#>  lifecycle     0.2.0       2020-03-06 [1] CRAN (R 4.0.0)                   
#>  magrittr      1.5         2014-11-22 [1] CRAN (R 4.0.0)                   
#>  Matrix        1.2-18      2019-11-27 [1] CRAN (R 4.0.2)                   
#>  memoise       1.1.0       2017-04-21 [1] CRAN (R 4.0.0)                   
#>  pillar        1.4.6       2020-07-10 [1] CRAN (R 4.0.2)                   
#>  pkgbuild      1.1.0       2020-07-13 [1] CRAN (R 4.0.2)                   
#>  pkgconfig     2.0.3       2019-09-22 [1] CRAN (R 4.0.0)                   
#>  pkgload       1.1.0       2020-05-29 [1] CRAN (R 4.0.0)                   
#>  prettyunits   1.1.1       2020-01-24 [1] CRAN (R 4.0.0)                   
#>  processx      3.4.3       2020-07-05 [1] CRAN (R 4.0.0)                   
#>  ps            1.3.4       2020-08-11 [1] CRAN (R 4.0.2)                   
#>  purrr         0.3.4       2020-04-17 [1] CRAN (R 4.0.0)                   
#>  pygt3x      * 0.0.6.9000  2020-08-31 [1] local                            
#>  R6            2.4.1       2019-11-12 [1] CRAN (R 4.0.0)                   
#>  Rcpp          1.0.5       2020-07-06 [1] CRAN (R 4.0.0)                   
#>  remotes       2.2.0       2020-07-21 [1] CRAN (R 4.0.2)                   
#>  reticulate    1.16        2020-05-27 [1] CRAN (R 4.0.0)                   
#>  rlang         0.4.7.9000  2020-08-25 [1] Github (r-lib/rlang@de0c176)     
#>  rmarkdown     2.3         2020-06-18 [1] CRAN (R 4.0.0)                   
#>  rprojroot     1.3-2       2018-01-03 [1] CRAN (R 4.0.0)                   
#>  sessioninfo   1.1.1       2018-11-05 [1] CRAN (R 4.0.0)                   
#>  stringi       1.4.6       2020-02-17 [1] CRAN (R 4.0.0)                   
#>  stringr       1.4.0       2019-02-10 [1] CRAN (R 4.0.0)                   
#>  testthat      2.99.0.9000 2020-08-25 [1] Github (r-lib/testthat@6a24275)  
#>  tibble        3.0.3       2020-07-10 [1] CRAN (R 4.0.2)                   
#>  usethis       1.6.1.9001  2020-08-25 [1] Github (r-lib/usethis@860c1ea)   
#>  utf8          1.1.4       2018-05-24 [1] CRAN (R 4.0.0)                   
#>  vctrs         0.3.2       2020-07-15 [1] CRAN (R 4.0.2)                   
#>  withr         2.2.0       2020-04-20 [1] CRAN (R 4.0.0)                   
#>  xfun          0.16        2020-07-24 [1] CRAN (R 4.0.2)                   
#>  yaml          2.2.1       2020-02-01 [1] CRAN (R 4.0.0)                   
#> 
#> [1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library
@shaheen-syed

I managed to read the content according to the NHANES format described here:
https://github.com/actigraph/NHANES-GT3X-File-Format/blob/master/fileformats/activity.bin.md

I converted the .gt3x file to a .csv file with ActiLife, and the acceleration values do match up. Note that the order in the binary file is YXZ, whereas the ActiLife export is XYZ. However, some values in the csv file are repeated for a very long duration; idle mode? I guess this is the imputation you talked about? I am unable to find out how to identify which values need to be repeated, and for how long. Any ideas?
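
To compare parsed values with the ActiLife export, the columns have to be swapped first. A minimal sketch, assuming each parsed sample is a (Y, X, Z) triple; `yxz_to_xyz` is a hypothetical helper name, not part of any library:

```python
def yxz_to_xyz(samples):
    """Reorder (Y, X, Z) triples into (X, Y, Z) triples."""
    return [(x, y, z) for (y, x, z) in samples]


# two made-up samples in file order (Y, X, Z)
parsed = [(0.707, 0.469, 0.522), (0.704, 0.466, 0.032)]
reordered = yxz_to_xyz(parsed)
```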

Also, I'm reading the bytes and converting them to bits and then taking 12 bits to get a single acceleration value. So for every 3 bytes we get 24 bits, which is 2 acceleration values. In the end, I am left with a remainder which I simply trim.

Some quick code here:

	import os

	import numpy as np
	from bitstring import Bits

	# file size in bytes
	file_size = os.path.getsize(file)
	# # total number of samples
	# num_samples = int(np.floor(file_size * 8 / 36))

	hz = 30
	axes = 3
	scale = 341.0

	# open the log.bin file in binary mode
	# (separate handle name so the path stored in `file` is not overwritten)
	with open(file, mode='rb') as f:

		# hold acceleration values
		values = []

		# read bytes as a string of bits ('0'/'1' characters)
		payload_bits = Bits(bytes=f.read(file_size)).bin

		# extract 12 bits as 1 acceleration value and add them to a list
		for i in range(0, len(payload_bits), 12):

			# stop when fewer than 12 bits remain; i is a bit offset, so
			# compare against the number of bits, not the file size in bytes
			if i + 12 > len(payload_bits):
				break

			# extract the 12 bits as a string
			bitstring = payload_bits[i:i+12]

			# convert to a signed (two's complement) integer
			acc_value = Bits(bin=bitstring).int

			# add to list
			values.append(acc_value)

		# convert list to numpy array
		acc = np.array(values)

		# number of trailing values that do not fill a whole second
		acc_cut = len(acc) % (hz * axes)

		# cut off the trailing values; skip when acc_cut is 0,
		# because acc[:-0] would return an empty array
		if acc_cut:
			acc = acc[:-acc_cut]
		# reshape to samples x axes
		acc = acc.reshape(len(acc) // axes, axes)
		# scale
		acc = acc / scale
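
For comparison, the same 12-bit unpacking can be done with byte arithmetic instead of building a bit string. This is only a sketch, assuming the values are packed most-significant-bit first with two values per three bytes, as described above; `unpack_12bit` is a hypothetical helper, not part of gt3x or pygt3x. Unlike the loop above, it ignores any trailing bytes that do not fill a 3-byte group, so it can drop one more value at the end of the file.

```python
def unpack_12bit(payload):
    """Unpack signed 12-bit values from bytes, two values per 3-byte group."""
    values = []
    # step through whole 3-byte groups only; leftover bytes are ignored
    for k in range(0, len(payload) - 2, 3):
        b0, b1, b2 = payload[k], payload[k + 1], payload[k + 2]
        first = (b0 << 4) | (b1 >> 4)       # b0 plus the high nibble of b1
        second = ((b1 & 0x0F) << 8) | b2    # low nibble of b1 plus b2
        for v in (first, second):
            # sign-extend: 12-bit values >= 2048 are negative in two's complement
            values.append(v - 4096 if v >= 2048 else v)
    return values


# 7 bytes: two full groups (4 values) and one leftover byte
example = unpack_12bit(bytes([0x12, 0x3F, 0xFF, 0x80, 0x00, 0x01, 0xAB]))
```

The result is a flat list of raw counts; dividing by the scale factor (341.0 in the code above) and grouping by three axes would follow as before.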

@muschellij2
Owner Author

I have made additions to gt3x over the past few days while looking at this as well. I think https://github.com/muschellij2/gt3x/blob/master/gt3x/gt3x_functions.py#L622 should fix it: if only 8 bits remain, it appends 0000 to the end to get the last value. I think truncating the time course to only num_samples = int(np.floor(file_size * 8 / 36)) points is what I need to do at the end of the day. The break you use may also work, but it may lop off the last record. Still, I think that check is crucial, because otherwise the parser reads past the end of the file and returns random results. I think this should be fixed now.
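
The truncation and padding described above can be sketched as follows. This is an illustration under the assumptions stated in this thread (12 bits per value, 3 axes, so 36 bits per sample), not the code at the linked line; `complete_samples` and `pad_last_value` are hypothetical names.

```python
BITS_PER_VALUE = 12
AXES = 3
BITS_PER_SAMPLE = BITS_PER_VALUE * AXES  # 36 bits per (Y, X, Z) sample


def complete_samples(file_size):
    """Number of whole samples in a payload of file_size bytes."""
    # integer floor division, same idea as int(np.floor(file_size * 8 / 36))
    return file_size * 8 // BITS_PER_SAMPLE


def pad_last_value(bits):
    """Right-pad a '0'/'1' string with zeros up to a multiple of 12 bits."""
    remainder = len(bits) % BITS_PER_VALUE
    if remainder:
        bits += "0" * (BITS_PER_VALUE - remainder)
    return bits


n = complete_samples(10)             # 80 bits hold 2 whole samples
padded = pad_last_value("10101010")  # 8 leftover bits get 0000 appended
```

Keeping only the first `complete_samples(file_size)` rows guarantees that nothing past the end of the payload ends up in the output, whichever way the trailing bits are handled.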
