## Fetch and prepare FWF data

### 1. Fetch

Fetch the PulseWaves data for one flight line from the NYU Spatial Data Repository.

In [1]:
line='161348'

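# Create the scratch directory and the destination directory for the PulseWaves files
!mkdir -p tmp ../data/pw
# Look up the archive URL for this flight line in the metadata file and download it
!grep '\<.*fwf_plswvs.*'$line'.*\>' ../data/metadata/sp19-bitstreams.json | tr -d ',' | xargs wget -P tmp -nd --no-check-certificate
# Unpack the archive, move the extracted files to ../data/pw, and clean up
!unzip tmp/*.zip -d tmp
!rm -rf tmp/*.zip
!find tmp -type f -exec mv {} "../data/pw" \;
!rm -rf tmp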

--2020-07-27 23:27:50--  https://archive.nyu.edu/bitstream/2451/60462/5/nyu_2451_60462_fwf_plswvs_FD_190511_161348.zip
Resolving archive.nyu.edu (archive.nyu.edu)... 128.122.108.142
Connecting to archive.nyu.edu (archive.nyu.edu)|128.122.108.142|:443... connected.
  Unable to locally verify the issuer's authority.
HTTP request sent, awaiting response... 200 OK
Length: 527670310 (503M) [application/octet-stream]
Saving to: ‘tmp/nyu_2451_60462_fwf_plswvs_FD_190511_161348.zip’


2020-07-27 23:28:30 (12.6 MB/s) - ‘tmp/nyu_2451_60462_fwf_plswvs_FD_190511_161348.zip’ saved [527670310/527670310]

Archive:  tmp/nyu_2451_60462_fwf_plswvs_FD_190511_161348.zip
  inflating: tmp/nyu_2451_60462_fwf_plswvs_FD_190511_161348/10552_NYU_M2_Pulse_Waves_MTA_Resolved-Scanner1-190511_161348_1-originalpoints.pls  
  inflating: tmp/nyu_2451_60462_fwf_plswvs_FD_190511_161348/10552_NYU_M2_Pulse_Waves_MTA_Resolved-Scanner1-190511_161348_1-originalpoints.wvs  
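
The `grep | tr` step in the cell above can also be sketched in Python. This is a minimal sketch, not the notebook's method: the helper name `find_bitstream_urls` is hypothetical, and it deliberately treats the metadata file as plain text lines (as `grep` does) because the JSON layout of `sp19-bitstreams.json` is not shown here. The second sample line is made up for illustration.

```python
import re

def find_bitstream_urls(lines, line_id, product="fwf_plswvs"):
    """Return URLs that contain the product tag followed by the flight-line ID.

    Hypothetical helper: scans raw text lines instead of parsing JSON,
    because the structure of the metadata file is an unknown here.
    """
    url_pattern = re.compile(r"https?://[^\s\"',]+")
    wanted = re.compile(re.escape(product) + r".*" + re.escape(line_id))
    urls = []
    for text in lines:
        for url in url_pattern.findall(text):
            if wanted.search(url):
                urls.append(url)
    return urls

# Illustrative input; the real lines would come from
# open("../data/metadata/sp19-bitstreams.json")
sample = [
    '  "https://archive.nyu.edu/bitstream/2451/60462/5/nyu_2451_60462_fwf_plswvs_FD_190511_161348.zip",',
    '  "https://example.org/some_other_product_FD_190511_161348.zip",',  # made-up entry
]
urls = find_bitstream_urls(sample, "161348")
print(urls)
```

Filtering in Python makes it easy to inspect the matched URLs before downloading anything, which helps when a flight-line ID matches more than one archive.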


Check that the PulseWaves files have been downloaded and unpacked.

In [2]:
!ls ../data/pw/*.pls
!ls ../data/pw/*.wvs

../data/pw/10552_NYU_M2_Pulse_Waves_MTA_Resolved-Scanner1-190511_161348_1-originalpoints.pls
../data/pw/10552_NYU_M2_Pulse_Waves_MTA_Resolved-Scanner1-190511_161348_1-originalpoints.wvs
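
Each pulse file (`.pls`) in the listing above has a waves file (`.wvs`) with the same base name. With many flight lines, this pairing can be checked programmatically. The sketch below assumes that pairing convention holds for every file in `../data/pw`; the helper name `unpaired_pulse_files` is hypothetical.

```python
from pathlib import Path

def unpaired_pulse_files(directory):
    """Return base names that have a .pls without a .wvs, or the reverse."""
    directory = Path(directory)
    pls = {p.stem for p in directory.glob("*.pls")}
    wvs = {p.stem for p in directory.glob("*.wvs")}
    # The symmetric difference keeps names present in only one of the two sets
    return sorted(pls ^ wvs)

missing = unpaired_pulse_files("../data/pw")
print(missing or "every .pls file has a matching .wvs file")
```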


Display the header of the first pulse (`.pls`) file.

In [3]:
pls_files = !ls ../data/pw/*.pls
first_pls_file = pls_files[0]
!java -jar "../jar/umg-cli-0.2.0-SNAPSHOT-jar-with-dependencies.jar" pulseinfo -i $first_pls_file

--------------------------------------------------
PULSE HEADER
--------------------------------------------------
fileSignature                  PulseWavesPulse
globalParameter                00000000
fileSourceID                   0
guidData1                      1900857167
guidData2                      41870
guidData3                      17123
guidData4                      
systemIdentifier               RiPROCESS 1.8.5
generatingSoftware             PulseWaves DLL 0.3 r11 (150617) by rapidlasso
fileCreationDate               149
fileCreationYear               2019
versionMajor                   0
versionMinor                   3
headerSize                     352
offsetToPulseData              9338
numberOfPulses                 4808941
pulseFormat                    0
pulseAttributes                0
pulseSize                      48
pulseCompression               0
reserved                       0
numberOfVariableLengthRecords  18
numberOfAppededVariableLengthRecords 0
tScaleF
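
The header values above allow a quick sanity check on the `.pls` file. This sketch assumes the pulse records are stored uncompressed (`pulseCompression` is 0), that each record occupies `pulseSize` bytes starting at `offsetToPulseData`, and that nothing follows them (no appended variable length records). Under those assumptions the result should equal the size of the file on disk. The function name is hypothetical.

```python
# Values copied from the header printed above
header = {
    "offsetToPulseData": 9338,
    "numberOfPulses": 4808941,
    "pulseSize": 48,
}

def end_of_pulse_records(h):
    """Byte offset just past the last pulse record (assumes fixed-size records)."""
    return h["offsetToPulseData"] + h["numberOfPulses"] * h["pulseSize"]

expected_size = end_of_pulse_records(header)
print(expected_size)  # 230838506
```

Comparing this number with `os.path.getsize(first_pls_file)` is one way to detect a truncated download before running the conversion.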

### 2. Convert the data to PWMsg format

The Java module below pairs each pulse with its waveforms and encodes the data in a [Protocol Buffers](https://developers.google.com/protocol-buffers) format. The encoded data (i.e. PWMsg) are split into small segments (e.g. 100000 pulses each) to make them more manageable. The message schema is defined in the `data/protobuf/pulsewaves.proto` file.
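
As a rough guide to how many PWMsg files to expect, the arithmetic can be sketched as follows. It assumes every pulse is written and each output file holds at most `segment` pulses; the converter's actual behaviour is not documented here, and `segment_layout` is a hypothetical helper.

```python
def segment_layout(n_pulses, segment_size):
    """Return (number of segments, pulses in the last segment)."""
    full, remainder = divmod(n_pulses, segment_size)
    if remainder == 0:
        return full, (segment_size if full else 0)
    return full + 1, remainder

# numberOfPulses from the pulse header and the -segment value used in this notebook
n_segments, last_size = segment_layout(4808941, 100000)
print(n_segments, last_size)  # 49 8941
```

Under these assumptions the flight line would yield 49 files, the last of which holds 8941 pulses.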

In [4]:
!java -jar "../jar/umg-cli-0.2.0-SNAPSHOT-jar-with-dependencies.jar" pw2proto -i ../data/pw -o ../data/pwmsg -meta_file ../data/metadata/sp19-meta.json -segment 100000

2020-07-27 23:40:01,673 [main] DEBUG umg.core.lidar.protobuf.cli.PW2Proto - Offset: [977000.000, 173000.000, 0.000, , 561600.000]
2020-07-27 23:40:01,674 [main] DEBUG umg.core.lidar.protobuf.cli.PW2Proto - Scale: [0.001, 0.001, 0.001, , 0.000]
2020-07-27 23:40:01,674 [main] DEBUG umg.core.lidar.protobuf.cli.PW2Proto - Flight ID map contains 85 entries.
2020-07-27 23:40:01,677 [main] DEBUG umg.core.lidar.protobuf.cli.PW2Proto - File source ID: 1
2020-07-27 23:40:01,679 [pool-1-thread-1] INFO  umg.core.lidar.protobuf.cli.PW2Proto - Processing /home/vvo/scratch/sunset/fwf/notebooks/../data/pw/10552_NYU_M2_Pulse_Waves_MTA_Resolved-Scanner1-190511_161348_1-originalpoints.pls
Processing /home/vvo/scratch/sunset/fwf/notebooks/../data/pw/10552_NYU_M2_Pulse_Waves_MTA_Resolved-Scanner1-190511_161348_1-originalpoints.pls
2020-07-27 23:40:01,682 [pool-1-thread-1] DEBUG umg.core.lidar.pulsewaves.PVariableLengthRecord - VLR length: 216
2020-07-27 23:40:01,682 [pool-1-thread-1] WARN  umg.core.lidar.p

2020-07-27 23:40:31,970 [pool-1-thread-1] INFO  umg.core.lidar.protobuf.cli.PW2Proto - 10552_NYU_M2_Pulse_Waves_MTA_Resolved-Scanner1-190511_161348_1-originalpoints.pls: Processed 2600000 pulses
2020-07-27 23:40:33,095 [pool-1-thread-1] INFO  umg.core.lidar.protobuf.cli.PW2Proto - 10552_NYU_M2_Pulse_Waves_MTA_Resolved-Scanner1-190511_161348_1-originalpoints.pls: Processed 2700000 pulses
2020-07-27 23:40:34,279 [pool-1-thread-1] INFO  umg.core.lidar.protobuf.cli.PW2Proto - 10552_NYU_M2_Pulse_Waves_MTA_Resolved-Scanner1-190511_161348_1-originalpoints.pls: Processed 2800000 pulses
2020-07-27 23:40:35,327 [pool-1-thread-1] INFO  umg.core.lidar.protobuf.cli.PW2Proto - 10552_NYU_M2_Pulse_Waves_MTA_Resolved-Scanner1-190511_161348_1-originalpoints.pls: Processed 2900000 pulses
2020-07-27 23:40:36,417 [pool-1-thread-1] INFO  umg.core.lidar.protobuf.cli.PW2Proto - 10552_NYU_M2_Pulse_Waves_MTA_Resolved-Scanner1-190511_161348_1-originalpoints.pls: Processed 3000000 pulses
2020-07-27 23:40:37,573 [

List the generated PWMsg files.

In [5]:
!ls ../data/pwmsg

10552_NYU_M2_Pulse_Waves_MTA_Resolved-Scanner1-190511_161348_1-originalpoints-0.pwmsg
10552_NYU_M2_Pulse_Waves_MTA_Resolved-Scanner1-190511_161348_1-originalpoints-10.pwmsg
10552_NYU_M2_Pulse_Waves_MTA_Resolved-Scanner1-190511_161348_1-originalpoints-11.pwmsg
10552_NYU_M2_Pulse_Waves_MTA_Resolved-Scanner1-190511_161348_1-originalpoints-12.pwmsg
10552_NYU_M2_Pulse_Waves_MTA_Resolved-Scanner1-190511_161348_1-originalpoints-13.pwmsg
10552_NYU_M2_Pulse_Waves_MTA_Resolved-Scanner1-190511_161348_1-originalpoints-14.pwmsg
10552_NYU_M2_Pulse_Waves_MTA_Resolved-Scanner1-190511_161348_1-originalpoints-15.pwmsg
10552_NYU_M2_Pulse_Waves_MTA_Resolved-Scanner1-190511_161348_1-originalpoints-16.pwmsg
10552_NYU_M2_Pulse_Waves_MTA_Resolved-Scanner1-190511_161348_1-originalpoints-17.pwmsg
10552_NYU_M2_Pulse_Waves_MTA_Resolved-Scanner1-190511_161348_1-originalpoints-18.pwmsg
10552_NYU_M2_Pulse_Waves_MTA_Resolved-Scanner1-190511_161348_1-originalpoints-19.pwmsg
10552_NYU_M2_Pulse_Waves_MTA_Reso
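
`ls` orders the files lexicographically, so segment 10 comes right after segment 0. To process the segments in numeric order, the files can be sorted by the index at the end of the file name. The sketch below assumes the `-<index>.pwmsg` naming seen in the listing above; `segment_index` is a hypothetical helper.

```python
import re
from pathlib import Path

def segment_index(path):
    """Extract the trailing segment number from a '<name>-<index>.pwmsg' file name."""
    match = re.search(r"-(\d+)\.pwmsg$", Path(path).name)
    # Files that do not follow the naming pattern sort first
    return int(match.group(1)) if match else -1

pwmsg_files = sorted(Path("../data/pwmsg").glob("*.pwmsg"), key=segment_index)
for f in pwmsg_files[:3]:
    print(f.name)
```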

See the notebook `01_fwf_inspect` for how to work with the PWMsg files in Python.