RemoteSensing remote-sensing image segmentation: multi-band train_demo.py raises an error #297

KuntaHu · 2020-06-19T16:08:13Z

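Hello, following the tutorial, I converted my multi-band data to npy format and set up the dataset file lists.
My input has 6 bands, so I set --channel 6 for training and ran train_demo.py.
It failed with the following error: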

2020-06-19 23:58:38 [INFO] 40 samples in file data/dataset/train.txt
2020-06-19 23:58:38 [INFO] 30 samples in file data/dataset/val.txt
W0619 23:58:39.167647 1220 device_context.cc:237] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 10.1, Runtime API Version: 9.0
W0619 23:58:39.171162 1220 device_context.cc:245] device: 0, cuDNN Version: 7.3.
2020-06-19 23:58:40,660-INFO: Instantiated empty configuration.
HDFS initialization failed, please check if .hdfscli，cfg exists.
Exception in thread Thread-6:
Traceback (most recent call last):
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/aistudio/contrib/RemoteSensing/readers/base.py", line 85, in handle_worker
    r = mapper(sample[0], sample[1], sample[2])
  File "/home/aistudio/contrib/RemoteSensing/transforms/transforms.py", line 68, in __call__
    outputs = op(im, im_info, label)
  File "/home/aistudio/contrib/RemoteSensing/transforms/transforms.py", line 488, in __call__
    im = normalize(im, self.min_val, self.max_val, mean, std)
  File "/home/aistudio/contrib/RemoteSensing/transforms/ops.py", line 25, in normalize
    im = (im.astype(np.float32, copy=False) - min_value) / range_value
ValueError: operands could not be broadcast together with shapes (256,256,6) (3,)

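I then tried selecting only 3 bands, saved them as npy, set --channel to 3, and reloaded the data. This ran correctly, which suggests my data file format is correct. Does train_demo.py only support 3-band input? How can I feed in multi-band data?

Thanks!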
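
The ValueError in the traceback is a NumPy broadcasting failure: an array of shape (256, 256, 6) is combined with per-channel values of length 3. Below is a minimal sketch of the mismatch, assuming a simplified `normalize` modeled on the single line shown in the traceback. The function body and the min/max values are illustrative assumptions, not the actual RemoteSensing implementation.

```python
import numpy as np


def normalize(im, min_value, max_value):
    # Hypothetical stand-in for the normalize step seen in the traceback:
    # scale each channel of an (H, W, C) image into [0, 1].
    min_value = np.asarray(min_value, dtype=np.float32)
    range_value = np.asarray(max_value, dtype=np.float32) - min_value
    return (im.astype(np.float32, copy=False) - min_value) / range_value


im = np.random.randint(0, 256, size=(256, 256, 6)).astype(np.uint8)

# Three per-channel values cannot broadcast against six channels.
try:
    normalize(im, [0] * 3, [255] * 3)
    failed = False
except ValueError:
    failed = True

# One value per band broadcasts along the last (channel) axis.
out = normalize(im, [0] * 6, [255] * 6)
print(failed, out.shape)
```

If the per-channel lists passed to the transforms are still sized for 3 channels, they would presumably need one entry per band when --channel is 6.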

KuntaHu · 2020-06-20T09:00:19Z

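I later found the cause: my own data had already been normalized, so the normalize step in transforms raised the error. The images need to be converted to the 0-255 range, while the labels remain single-channel images with values 0-1. With that change it runs. However, when I run train_demo (with the data replaced by my own multi-channel data) to fine-tune the Unet model, the loss does not decrease and stays around 1, the IoU stays at 0.5, and kappa is -1. Predictions from the trained model are all 0.

How should I use train_demo, or the cloud segmentation model that RemoteSensing has already trained, to fine-tune on my dataset? Thanks.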
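
The data preparation described in this comment can be sketched as follows, assuming the images were stored as floats in [0, 1] and the labels are single-channel arrays containing only 0 and 1. The helpers `to_0_255` and `check_label` are hypothetical names for illustration and are not part of RemoteSensing.

```python
import numpy as np


def to_0_255(im):
    # Assumes `im` is an (H, W, C) float array already scaled to [0, 1].
    return np.clip(im * 255.0, 0, 255).astype(np.float32)


def check_label(label, num_classes=2):
    # A label map should be 2-D and contain only class ids below num_classes.
    values = np.unique(label)
    return bool(label.ndim == 2 and values.min() >= 0 and values.max() < num_classes)


im = np.random.rand(256, 256, 6).astype(np.float32)
label = np.random.randint(0, 2, size=(256, 256)).astype(np.uint8)

scaled = to_0_255(im)
print(scaled.min(), scaled.max(), check_label(label))
```

A check like `check_label` can help rule out labels saved as 0/255 instead of 0/1, which is one possible reason for a model that predicts a single class.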

chang-png · 2020-06-22T12:58:59Z

Hello, could you share the code for converting data to npy format? Some of my data always fails to convert. Thanks.

LutaoChu · 2020-06-23T06:08:36Z

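> I later found the cause: my own data had already been normalized, so the normalize step in transforms raised the error. The images need to be converted to the 0-255 range, while the labels remain single-channel images with values 0-1. With that change it runs. However, when I run train_demo (with the data replaced by my own multi-channel data) to fine-tune the Unet model, the loss does not decrease and stays around 1, the IoU stays at 0.5, and kappa is -1. Predictions from the trained model are all 0.
>
> How should I use train_demo, or the cloud segmentation model that RemoteSensing has already trained, to fine-tune on my dataset? Thanks.

Hello, are you using the develop version or release/v0.5.0? We recommend the latest develop version.
Regarding the training loss, the configuration is probably mismatched. Please share your train_demo script and I will take a look.
Regarding transfer from a pretrained model: transfer learning requires your data to have the same number of bands as the pretrained model. If your dataset has the same number of bands as the cloud dataset, you can fine-tune directly. If your dataset is not large, I suggest training directly.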

LutaoChu · 2020-06-23T06:16:37Z

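> Hello, could you share the code for converting data to npy format? Some of my data always fails to convert. Thanks.

Here is code for converting tif to npy, for reference: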

from osgeo import gdal  # older GDAL builds also allow a bare `import gdal`
import numpy as np
import os
import os.path as osp


def readTifImg(fileName):
    # Read every band of the file into a NumPy array.
    dataset = gdal.Open(fileName)
    if dataset is None:
        raise Exception('can not open ' + fileName)
    im_data = dataset.ReadAsArray()
    return im_data


img_dir = 'xxx'
output_dir = 'xxx'
if not osp.exists(output_dir):
    os.makedirs(output_dir)
count = 0
for path, dir_list, file_list in os.walk(img_dir):
    for file_name in file_list:
        file = osp.join(path, file_name)
        img = readTifImg(file)
        # Multi-band data is read as (bands, H, W); convert it to (H, W, bands).
        if img.ndim == 3:
            img = img.transpose((1, 2, 0))

        # splitext drops only the extension; rstrip('.tif') would also strip
        # trailing '.', 't', 'i', 'f' characters from the base name.
        output_file = osp.join(output_dir, osp.splitext(file_name)[0])
        np.save(output_file, img)
        count += 1
        if count % 10 == 0:
            print("current process: {}.".format(count))

print('total count = ', count)

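The latest develop version already supports four formats: tif, img, png, and npy. Data in these formats needs no conversion and can be read directly; this is the recommended approach.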
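
One pitfall when deriving an output name from a .tif file name: `str.rstrip('.tif')` removes any trailing '.', 't', 'i', or 'f' characters rather than the '.tif' suffix, so output names can be shortened and different inputs can collide. A small illustration with a made-up file name:

```python
import os.path as osp

# str.rstrip treats its argument as a set of characters, not as a suffix.
stripped = 'test.tif'.rstrip('.tif')

# os.path.splitext splits off only the final extension.
base, ext = osp.splitext('test.tif')
print(repr(stripped), repr(base), repr(ext))
```

Using `os.path.splitext` keeps one output file per input file as long as the base names differ.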

KuntaHu · 2020-06-25T04:57:20Z

Hello, I tried the develop version, but unlike 0.5.0 it uses gdal. Installing it with conda keeps failing with the following error:

ERROR conda.core.link:_execute(637): An error occurred while installing package 'conda-forge::parso-0.7.0-pyh9f0ad1d_0'.
PermissionError(13, 'Permission denied')
Attempting to roll back.

Rolling back transaction: | WARNING conda.gateways.disk.delete:unlink_or_rename_to_trash(140): Could not remove or rename /opt/conda/lib/python3.6/site-packages/parso/__pycache__/__init__.cpython-36.pyc. Please remove this file manually (you may need to reboot to free file handles)
WARNING conda.gateways.disk.delete:unlink_or_rename_to_trash(140): Could not remove or rename /opt/conda/lib/python3.6/site-packages/parso/__pycache__/_compatibility.cpython-36.pyc. Please remove this file manually (you may need to reboot to free file handles)
WARNING conda.gateways.disk.delete:unlink_or_rename_to_trash(140): Could not remove or rename /opt/conda/lib/python3.6/site-packages/parso/__pycache__/cache.cpython-36.pyc. Please remove this file manually (you may need to reboot to free file handles)
WARNING conda.gateways.disk.delete:unlink_or_rename_to_trash(140): Could not remove or rename /opt/conda/lib/python3.6/site-packages/parso/__pycache__/file_io.cpython-36.pyc. Please remove this file manually (you may need to reboot to free file handles)
WARNING conda.gateways.disk.delete:unlink_or_rename_to_trash(140): Could not remove or rename /opt/conda/lib/python3.6/site-packages/parso/__pycache__/grammar.cpython-36.pyc. Please remove this file manually (you may need to reboot to free file handles)
WARNING conda.gateways.disk.delete:unlink_or_rename_to_trash(140): Could not remove or rename /opt/conda/lib/python3.6/site-packages/parso/__pycache__/normalizer.cpython-36.pyc. Please remove this file manually (you may need to reboot to free file handles)
WARNING conda.gateways.disk.delete:unlink_or_rename_to_trash(140): Could not remove or rename /opt/conda/lib/python3.6/site-packages/parso/__pycache__/parser.cpython-36.pyc. Please remove this file manually (you may need to reboot to free file handles)
WARNING conda.gateways.disk.delete:unlink_or_rename_to_trash(140): Could not remove or rename /opt/conda/lib/python3.6/site-packages/parso/pgen2/__pycache__/__init__.cpython-36.pyc. Please remove this file manually (you may need to reboot to free file handles)
WARNING conda.gateways.disk.delete:unlink_or_rename_to_trash(140): Could not remove or rename /opt/conda/lib/python3.6/site-packages/parso/pgen2/__pycache__/generator.cpython-36.pyc. Please remove this file manually (you may need to reboot to free file handles)
WARNING conda.gateways.disk.delete:unlink_or_rename_to_trash(140): Could not remove or rename /opt/conda/lib/python3.6/site-packages/parso/pgen2/__pycache__/grammar_parser.cpython-36.pyc. Please remove this file manually (you may need to reboot to free file handles)
WARNING conda.gateways.disk.delete:unlink_or_rename_to_trash(140): Could not remove or rename /opt/conda/lib/python3.6/site-packages/parso/python/__pycache__/__init__.cpython-36.pyc. Please remove this file manually (you may need to reboot to free file handles)
WARNING conda.gateways.disk.delete:unlink_or_rename_to_trash(140): Could not remove or rename /opt/conda/lib/python3.6/site-packages/parso/python/__pycache__/diff.cpython-36.pyc. Please remove this file manually (you may need to reboot to free file handles)
WARNING conda.gateways.disk.delete:unlink_or_rename_to_trash(140): Could not remove or rename /opt/conda/lib/python3.6/site-packages/parso/python/__pycache__/errors.cpython-36.pyc. Please remove this file manually (you may need to reboot to free file handles)
WARNING conda.gateways.disk.delete:unlink_or_rename_to_trash(140): Could not remove or rename /opt/conda/lib/python3.6/site-packages/parso/python/__pycache__/parser.cpython-36.pyc. Please remove this file manually (you may need to reboot to free file handles)
WARNING conda.gateways.disk.delete:unlink_or_rename_to_trash(140): Could not remove or rename /opt/conda/lib/python3.6/site-packages/parso/python/__pycache__/pep8.cpython-36.pyc. Please remove this file manually (you may need to reboot to free file handles)
WARNING conda.gateways.disk.delete:unlink_or_rename_to_trash(140): Could not remove or rename /opt/conda/lib/python3.6/site-packages/parso/python/__pycache__/prefix.cpython-36.pyc. Please remove this file manually (you may need to reboot to free file handles)
WARNING conda.gateways.disk.delete:unlink_or_rename_to_trash(140): Could not remove or rename /opt/conda/lib/python3.6/site-packages/parso/python/__pycache__/token.cpython-36.pyc. Please remove this file manually (you may need to reboot to free file handles)
WARNING conda.gateways.disk.delete:unlink_or_rename_to_trash(140): Could not remove or rename /opt/conda/lib/python3.6/site-packages/parso/python/__pycache__/tokenize.cpython-36.pyc. Please remove this file manually (you may need to reboot to free file handles)
WARNING conda.gateways.disk.delete:unlink_or_rename_to_trash(140): Could not remove or rename /opt/conda/lib/python3.6/site-packages/parso/python/__pycache__/tree.cpython-36.pyc. Please remove this file manually (you may need to reboot to free file handles)
WARNING conda.gateways.disk.delete:unlink_or_rename_to_trash(140): Could not remove or rename /opt/conda/lib/python3.6/site-packages/parso/__pycache__/tree.cpython-36.pyc. Please remove this file manually (you may need to reboot to free file handles)
WARNING conda.gateways.disk.delete:unlink_or_rename_to_trash(140): Could not remove or rename /opt/conda/lib/python3.6/site-packages/parso/__pycache__/utils.cpython-36.pyc. Please remove this file manually (you may need to reboot to free file handles)
done

[Errno 13] Permission denied: '/opt/conda/lib/python3.6/site-packages/parso-0.7.0.dist-info/AUTHORS.txt'

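Since I had already converted my data to npy locally, everything runs normally once I comment out import gdal. Still, I would like to be able to install gdal properly.

I also have a few small questions:

Can the backbone of a model in RemoteSensing be switched? And what are the default backbones of Unet and hrnet in RemoteSensing?
Can the loss and other evaluation metrics from training be saved? I would like to plot the loss against epochs. Does PaddleSeg have code for this? I noticed that a log file is saved in saved_model; is there a way to read it?

Thanks!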

nepeplwu · 2020-06-29T02:31:33Z

@KuntaHu

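Unet does not support switching the backbone. hrnet can be switched between structures of different sizes: specify the number of channels and the number of modules for each stage when creating the model.
For the interface parameters, see: https://github.com/PaddlePaddle/PaddleSeg/blob/release/v0.5.0/contrib/RemoteSensing/models/hrnet.py#L47
The model.train interface has a use_vdl parameter. Setting it to True records log files automatically, and you can then start a front-end page with the visualdl command to view the training logs.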

Rayaction · 2020-11-09T02:39:12Z

> The latest develop version already supports four formats: tif, img, png, and npy. Data in these formats needs no conversion and can be read directly; this is the recommended approach.

Among tif, img, png, and npy, does img include jpg?

LutaoChu · 2020-11-09T09:14:48Z

img is a separate format and does not include jpg.
To read jpg you can use opencv or PIL; adding a few lines of code to read_img is enough:
https://github.com/PaddlePaddle/PaddleSeg/blob/develop/contrib/RemoteSensing/readers/reader.py

LutaoChu · 2020-11-09T09:18:46Z

Note, however, the difference between the PIL and opencv libraries when reading images:
opencv: the channel order is BGR
PIL: the channel order is RGB
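
The channel-order difference can be illustrated with NumPy alone. This is a small sketch on a synthetic array rather than a real image file; the array stands in for what `cv2.imread` would return.

```python
import numpy as np

# A tiny (H, W, 3) image in BGR order, as cv2.imread would return it:
# every pixel is pure blue (B=255, G=0, R=0).
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 255

# Reversing the last axis swaps BGR to RGB; for 3-channel images this gives
# the same result as cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB).
rgb = bgr[..., ::-1]
print(rgb[0, 0])
```

Whether the swap is needed depends on the channel order the model was trained with.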

Rayaction · 2020-11-09T11:52:07Z

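> Note, however, the difference between the PIL and opencv libraries when reading images:
> opencv: the channel order is BGR
> PIL: the channel order is RGB

So I need to add a cvtColor call to convert it, right?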

LutaoChu · 2020-11-10T02:40:12Z

That depends on whether you need the conversion. If you do, convert with cvtColor.

Rayaction · 2020-11-10T07:18:55Z

> That depends on whether you need the conversion. If you do, convert with cvtColor.

I added this branch to the reader:
elif ext == '.jpg':
    im_data = cv2.imread(img_path)
    im_data = cv2.cvtColor(im_data, cv2.COLOR_BGR2RGB)
    print(im_data.shape)
    return np.transpose(im_data, [2, 0, 1]).astype(np.float32)
But it still fails after this change. Does paddle expect CHW or HWC input?
The error is:
ValueError: operands could not be broadcast together with shapes (1000,1000,256) (3,)

LutaoChu · 2020-11-10T09:34:29Z

paddle input is NCHW.
Try removing the np.transpose operation.
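
The first traceback in this thread shows normalize receiving an array of shape (256, 256, 6), so the transforms appear to operate on (H, W, C) images, with per-channel statistics broadcast along the last axis. Here is a sketch of why transposing to (C, H, W) inside the reader breaks that, using made-up mean values:

```python
import numpy as np

mean = np.array([0.5, 0.5, 0.5], dtype=np.float32)  # illustrative values
hwc = np.zeros((1000, 1000, 3), dtype=np.float32)

# (H, W, C): the length-3 mean lines up with the channel axis.
ok = hwc - mean

# (C, H, W): the last axis is now the width (1000), so broadcasting fails.
chw = np.transpose(hwc, [2, 0, 1])
try:
    chw - mean
    failed = False
except ValueError:
    failed = True

print(ok.shape, chw.shape, failed)
```

Under this assumption, the reader would return (H, W, C) data and leave the conversion to NCHW to the later stages of the pipeline.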

Rayaction · 2020-11-10T12:19:48Z

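That seems to work, but for every --train_batch_size from 32 down to 1 I get out-of-memory errors such as: Out of memory error on GPU 0. Cannot allocate 7.629395GB memory on GPU 0, available memory is only 5.876587GB. My GPU is a V100 with 32 GB.
With a batch size of 1 it says it cannot allocate a little over 100 MB. Why is that? I have also set the visible CUDA devices.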

LutaoChu · 2020-11-10T12:37:48Z

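The framework does not allocate the full 32 GB of GPU memory at once; it allocates the memory it needs over multiple requests. So an error about failing to allocate a little over 100 MB is expected: it means that at the final request, less than that amount of GPU memory was left.

Is your model too large, or are your images too large?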

Rayaction · 2020-11-10T12:50:25Z

I am using the remote sensing code as is. The images are 256×256 and the model is unet, so that should not be the case.

Rayaction · 2020-11-10T13:12:24Z

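#574
Could you take a look at this for me? The directory layout of the model I downloaded does not look right.
Loading it reports that the model was not found.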

wuyefeilin assigned wuyefeilin and LutaoChu and unassigned wuyefeilin Jun 22, 2020

michaelowenliu closed this as completed Nov 10, 2021
