## Fetch and prepare FWF data

### 1. Fetch

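Fetch the full-waveform (FWF) PulseWaves data for one flight line from the NYU Spatial Data Repository. The cell below looks up the download URL of the line's `fwf_plswvs` archive in `../data/metadata/d15-bitstreams.json`, downloads and unpacks it, and moves the resulting `.pls` and `.wvs` files to `../data/pw`.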

In [1]:
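line = '115601'  # ID of the flight line to fetch

# Create the temporary download folder and the destination folder
!mkdir -p tmp ../data/pw
# Look up the line's archive URL in the metadata file and download it
!grep '\<.*fwf_plswvs.*'$line'.*\>' ../data/metadata/d15-bitstreams.json | tr -d ',' | xargs wget -P tmp -nd --no-check-certificate
!unzip tmp/*.zip -d tmp
!rm -rf tmp/*.zip
# Move the unpacked files out of the archive's subfolder, then clean up
!find tmp -type f -exec mv {} "../data/pw" \;
!rm -rf tmp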

--2020-07-30 18:49:07--  https://archive.nyu.edu/retrieve/79995/nyu_2451_38625_fwf_plswvs_F_150326_115601.zip
Resolving archive.nyu.edu (archive.nyu.edu)... 128.122.108.142
Connecting to archive.nyu.edu (archive.nyu.edu)|128.122.108.142|:443... connected.
  Unable to locally verify the issuer's authority.
HTTP request sent, awaiting response... 200 OK
Length: 961431605 (917M) [application/octet-stream]
Saving to: ‘tmp/nyu_2451_38625_fwf_plswvs_F_150326_115601.zip’


2020-07-30 18:50:49 (9.06 MB/s) - ‘tmp/nyu_2451_38625_fwf_plswvs_F_150326_115601.zip’ saved [961431605/961431605]

Archive:  tmp/nyu_2451_38625_fwf_plswvs_F_150326_115601.zip
   creating: tmp/nyu_2451_38625_pulse/
  inflating: tmp/nyu_2451_38625_pulse/F_150326_115601.pls  
  inflating: tmp/nyu_2451_38625_pulse/F_150326_115601.wvs  
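
The URL lookup done by `grep` in the cell above can also be sketched in Python. This is a minimal sketch, not the notebook's method: the layout of `d15-bitstreams.json` is not shown here, so the function `select_urls` (a hypothetical name) treats the file as plain text and keeps every URL that mentions both the product tag and the flight line. The second URL in `sample` is made up for illustration.

```python
import re

def select_urls(metadata_text, flight_line, product="fwf_plswvs"):
    """Return the URLs that mention both the product tag and the flight line."""
    # A URL ends at whitespace, a quote, or a comma.
    url_pattern = re.compile(r"https?://[^\s\"',]+")
    return [
        url
        for url in url_pattern.findall(metadata_text)
        if product in url and flight_line in url
    ]

# Hypothetical excerpt; the real structure of the metadata file is an assumption.
sample = '''
  "https://archive.nyu.edu/retrieve/79995/nyu_2451_38625_fwf_plswvs_F_150326_115601.zip",
  "https://example.org/other_fwf_plswvs_F_150326_999999.zip",
'''

print(select_urls(sample, "115601"))
```

Unlike the `grep` pattern, this version does not require the product tag to appear before the flight line, which is a slightly looser match.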


Check that the PulseWaves files have been downloaded and unpacked. Each flight line should have a pulse file (`.pls`) and a matching waves file (`.wvs`).

In [2]:
!ls ../data/pw/*.pls
!ls ../data/pw/*.wvs

../data/pw/F_150326_115601.pls
../data/pw/F_150326_115601.wvs
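
With more than one flight line in the folder, comparing the two listings by eye becomes error-prone. The sketch below, which assumes that a `.pls` file and its `.wvs` file share the same base name (as they do in the listing above), reports any file that lacks its counterpart. The function name `unpaired_pulse_files` is hypothetical.

```python
from pathlib import Path

def unpaired_pulse_files(directory):
    """Return base names that have a .pls or a .wvs file, but not both."""
    directory = Path(directory)
    pls = {p.stem for p in directory.glob("*.pls")}
    wvs = {p.stem for p in directory.glob("*.wvs")}
    # Symmetric difference: names present in exactly one of the two sets.
    return sorted(pls ^ wvs)
```

Calling `unpaired_pulse_files("../data/pw")` should return an empty list when every pulse file has its waves file.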


Display the header and the variable length records (VLRs) of the first pulse file

In [3]:
pls_files = !ls ../data/pw/*.pls
first_pls_file = pls_files[0]
!java -jar "../jar/umg-cli-0.2.0-SNAPSHOT-jar-with-dependencies.jar" pulseinfo -i $first_pls_file

--------------------------------------------------
PULSE HEADER
--------------------------------------------------
fileSignature                  PulseWavesPulse
globalParameter                00000000
fileSourceID                   0
guidData1                      203075702
guidData2                      63941
guidData3                      18986
guidData4                      
systemIdentifier               RiPROCESS 1.6.5
generatingSoftware             PulseWaves DLL 0.3 r11 (140921) by rapidlasso
fileCreationDate               233
fileCreationYear               2015
versionMajor                   0
versionMinor                   3
headerSize                     352
offsetToPulseData              9271
numberOfPulses                 15638619
pulseFormat                    0
pulseAttributes                0
pulseSize                      48
pulseCompression               0
reserved                       0
numberOfVariableLengthRecords  18
numberOfAppededVariab

--------------------------------------------------
VLR #10 of 18
--------------------------------------------------
userID                         PulseWaves_Spec
recordID                       200005
reserved                       0
recordLengthAfterHeader        404
description                    PulseWaves 0.3 r11 (140921) by rapidlasso
raw data                       [404 bytes]
----- Pulse descriptor -----
size                           92
reserved                       0
opticalCenterToAnchorPoint     0
numberOfExtraWaveBytes         0
numberOfSamplings              3
sampleUnits                    1.000
compression                    0
scannerIndex                   1
description                    1 x RP, 2 x LP, 1 x HP

Sampling descriptor #0
size                           104
reserved                       0
type                           1
channel                        3
unused                         0
bitsForDurationFromAnchor      32
scaleForDur
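
A few header fields can be combined into quick sanity checks. The sketch below parses an excerpt of the dump above into a dictionary and derives the size of the pulse-record block and the file creation date. It rests on two assumptions that the dump itself does not confirm: that the pulse records are stored contiguously from `offsetToPulseData` with a fixed `pulseSize`, and that `fileCreationDate` is a 1-based day of the year. The helper `parse_header_dump` is a hypothetical name, not part of the `umg-cli` tool.

```python
from datetime import date, timedelta

def parse_header_dump(text):
    """Parse 'name    value' lines into a dict; lines without a value are skipped."""
    fields = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            fields[parts[0]] = parts[1].strip()
    return fields

# Excerpt copied from the header dump above.
dump = """\
fileCreationDate               233
fileCreationYear               2015
offsetToPulseData              9271
numberOfPulses                 15638619
pulseSize                      48
"""
header = parse_header_dump(dump)

# Bytes occupied by the fixed-size pulse records (assumed contiguous).
pulse_bytes = int(header["numberOfPulses"]) * int(header["pulseSize"])
# Offset of the first byte after the pulse records.
end_of_pulses = int(header["offsetToPulseData"]) + pulse_bytes

# Assumption: fileCreationDate counts days from 1 January (day 1).
created = date(int(header["fileCreationYear"]), 1, 1) + timedelta(
    days=int(header["fileCreationDate"]) - 1
)

print(pulse_bytes, end_of_pulses, created.isoformat())
```

If the assumptions hold, the `.pls` file should be at least `end_of_pulses` bytes long, which can be compared against the size reported by `ls -l`.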

### 2. Convert the data to PWMsg format

The Java module below pairs each pulse with its waveforms and encodes the result as [Protocol Buffers](https://developers.google.com/protocol-buffers) messages. The encoded data (PWMsg) are split into small segments (e.g. 100000 pulses each) to make them more manageable. The message schema is defined in the `data/protobuf/pulsewaves.proto` file.

In [4]:
!java -jar "../jar/umg-cli-0.2.0-SNAPSHOT-jar-with-dependencies.jar" pw2proto -idir ../data/pw -odir ../data/pwmsg -meta_file ../data/metadata/sp19-meta.json -segment 100000 -subseq 0 5000

2020-07-30 18:51:09,611 [main] DEBUG umg.core.lidar.protobuf.cli.PW2Proto - Offset: [977000.000, 173000.000, 0.000, , 0.000]
2020-07-30 18:51:09,612 [main] DEBUG umg.core.lidar.protobuf.cli.PW2Proto - Scale: [0.001, 0.001, 0.001, , 0.000]
2020-07-30 18:51:09,612 [main] DEBUG umg.core.lidar.protobuf.cli.PW2Proto - Flight ID map contains 85 entries.
2020-07-30 18:51:09,616 [main] DEBUG umg.core.lidar.protobuf.cli.PW2Proto - File source ID: 0
2020-07-30 18:51:09,618 [pool-1-thread-1] INFO  umg.core.lidar.protobuf.cli.PW2Proto - Processing /home/vvo/code/vvogit/sunsetpark/notebooks/../data/pw/F_150326_115601.pls
Processing /home/vvo/code/vvogit/sunsetpark/notebooks/../data/pw/F_150326_115601.pls
2020-07-30 18:51:09,621 [pool-1-thread-1] DEBUG umg.core.lidar.pulsewaves.PVariableLengthRecord - VLR length: 208
2020-07-30 18:51:09,621 [pool-1-thread-1] WARN  umg.core.lidar.protobuf.cli.PW2Proto - No support for vlr #34735
2020-07-30 18:51:09,621 [pool-1-thread-1] DEBUG umg.core.lidar.pulsewave
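
The `Offset` and `Scale` values in the log are presumably used in the usual LAS/PulseWaves way, where stored integer coordinates are turned into real-world coordinates as `value = raw * scale + offset`. The sketch below illustrates that convention with the first three entries of each logged vector, taken here to be x, y and z. Both the interpretation and the raw sample values are assumptions; the function `to_world` is hypothetical and not part of `PW2Proto`.

```python
def to_world(raw, scale, offset):
    """Apply the per-axis scale and offset to integer coordinates."""
    return tuple(r * s + o for r, s, o in zip(raw, scale, offset))

# First three entries of the Offset and Scale vectors from the log above.
OFFSET = (977000.0, 173000.0, 0.0)
SCALE = (0.001, 0.001, 0.001)

# Hypothetical raw integer coordinates, for illustration only.
raw_xyz = (1234567, 7654321, 15250)

print(to_world(raw_xyz, SCALE, OFFSET))
```

With a scale of 0.001 the stored integers have millimetre resolution if the coordinate unit is metres, and the offset keeps them small enough to fit in 32 bits.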

List the generated PWMsg files

In [5]:
!ls ../data/pwmsg

F_150326_115601-0.pwmsg
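
A single output file is what the segmentation arithmetic predicts if `-subseq 0 5000` limits the conversion to roughly the first 5000 pulses: they all fit into one segment of 100000. The sketch below makes that explicit. The meaning of `-subseq` and the naming scheme `<base name>-<segment index>.pwmsg` are assumptions inferred from the command and the listing above, and `segment_files` is a hypothetical helper.

```python
import math

def segment_files(stem, n_pulses, segment_size):
    """Return the expected segment file names for a given pulse count."""
    n_segments = math.ceil(n_pulses / segment_size)
    return [f"{stem}-{i}.pwmsg" for i in range(n_segments)]

# Subsequence of 5000 pulses, as assumed for `-subseq 0 5000`.
print(segment_files("F_150326_115601", 5000, 100000))

# Full flight line, using numberOfPulses from the pulse header.
print(len(segment_files("F_150326_115601", 15638619, 100000)))
```

Under these assumptions, converting the full flight line without `-subseq` would produce 157 segment files.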


See notebook [01_pwmsg-d15.ipynb](01_pwmsg-d15.ipynb) for how to work with the PWMsg files in Python.