Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DISPIMG函数下的图片如何获取? #363

Open
RexH0 opened this issue Oct 29, 2023 · 10 comments
Open

DISPIMG函数下的图片如何获取? #363

RexH0 opened this issue Oct 29, 2023 · 10 comments
Labels

Comments

@RexH0
Copy link

RexH0 commented Oct 29, 2023

图片嵌在函数里,如=DISPIMG("ID_61A4C7AEBBCE4C2AB56FC63352E00CBD",1),使用reader.listPictures();无法取到这些图片,有没有什么其他的办法?

@wangguanquan
Copy link
Owner

wangguanquan commented Oct 29, 2023 via email

@RexH0
Copy link
Author

RexH0 commented Oct 30, 2023

图片添加方式及demo已邮件发送
图片所处的位置应该跟 https://blog.csdn.net/maudboy/article/details/133145278 这个文章说的差不多

@wangguanquan
Copy link
Owner

收到。

我看了下,这种格式应该是wps自定义的格式并不兼容office,所以用office也无法打开这种图片。读取的话也是可以的,只是和现在的listPictures()方法有些不太一样,cellimages.xlm只包含图片并不包含row和col信息,row和col必须通过sheet.xml获取,所以需要一些特殊处理,要么在处理图片之前预先处理sheet.xml来获取行列信息,要么先读取cellimages.xlm并缓存起来,然后使用Row#getImage来获取图片

@RexH0
Copy link
Author

RexH0 commented Oct 30, 2023

麻烦给个demo ,昨天看了eec的 源码 没搞定

@wangguanquan
Copy link
Owner

wangguanquan commented Oct 30, 2023

ExcelReader除了静态read方法还可以实例化,所以我们可以继承ExcelReader并提供一个方法解析内嵌图片xml,解析的逻辑与现有XMLDrawings几乎一样,示例代码如下

public class WpsExcelReader extends ExcelReader {

    public WpsExcelReader(Path path) throws IOException {
        super(path);
    }

    /**
     * 拉取WPS单元格内嵌图片
     *
     * @return ID:图片本地路径
     */
    public Map<String, Path> listCellImages() {
        ZipEntry entry = getEntry(zipFile, "xl/cellimages.xml");
        if (entry == null) return Collections.emptyMap();
        SAXReader reader = SAXReader.createDefault();

        ZipEntry refEntry = getEntry(zipFile, "xl/_rels/cellimages.xml.rels");
        if (refEntry == null) return Collections.emptyMap();
        Document document;
        try {
            document = reader.read(zipFile.getInputStream(refEntry));
        } catch (DocumentException | IOException e) {
            e.printStackTrace();
            return null;
        }
        List<Element> list = document.getRootElement().elements();
        Relationship[] rels = new Relationship[list.size()];
        int i = 0;
        for (Element e : list) {
            rels[i++] = new Relationship(e.attributeValue("Id"), e.attributeValue("Target"), e.attributeValue("Type"));
        }
        RelManager relManager = RelManager.of(rels);

        Element cellImages;
        try {
            cellImages = reader.read(zipFile.getInputStream(entry)).getRootElement();
        } catch (IOException | DocumentException e) {
            e.printStackTrace();
            return null;
        }
        List<Element> images = cellImages.elements();
        Namespace xdr = cellImages.getNamespaceForPrefix("xdr"), a = cellImages.getNamespaceForPrefix("a");
        // 图片临时存放的位置
        if (tempDir == null) {
            try {
                tempDir = FileUtil.mktmp("eec+");
            } catch (IOException e) {
                throw new ExcelReadException("创建临时文件夹失败.", e);
            }
        }
        Map<String, Path> cellImageMapper = new HashMap<>(images.size());
        for (Element e : images) {
            Element pic = e.element(QName.get("pic", xdr));
            // Not a picture
            if (pic == null) continue;

            Element nvPicPr = pic.element(QName.get("nvPicPr", xdr));
            if (nvPicPr == null) continue;
            Element cNvPr = nvPicPr.element(QName.get("cNvPr", xdr));
            if (cNvPr == null) continue;
            String name = cNvPr.attributeValue("name");

            Element blipFill = pic.element(QName.get("blipFill", xdr));
            if (blipFill == null) continue;

            Element blip = blipFill.element(QName.get("blip", a));
            if (blip == null) continue;

            Namespace r = blip.getNamespaceForPrefix("r");
            String embed = blip.attributeValue(QName.get("embed", r));
            Relationship rel = relManager.getById(embed);
            if (r != null && Const.Relationship.IMAGE.equals(rel.getType())) {
                Path localPath = null;
                // 复制图片到临时文件夹
                entry = getEntry(zipFile, "xl/" + rel.getTarget());
                if (entry != null) {
                    try {
                        Path targetPath = tempDir.resolve(rel.getTarget());
                        if (!Files.exists(targetPath.getParent())) {
                            Files.createDirectories(targetPath);
                        }
                        Files.copy(zipFile.getInputStream(entry), targetPath, StandardCopyOption.REPLACE_EXISTING);
                        localPath = targetPath;
                    } catch (IOException ioException) {
                        ioException.printStackTrace();
                    }
                    cellImageMapper.put(name, localPath);
                }

            }
        }
        return cellImageMapper;
    }
}

读取的时候就可以使用自定义WpsExcelReader类

@Test public void testCellImages() throws IOException {
    try (WpsExcelReader reader = new WpsExcelReader(Paths.get("./嵌入图片测试.xlsx"))) {
        // 获取所有单元格内嵌图片
        Map<String, Path> cellImages = reader.listCellImages();
        for (Iterator<Row> iter = reader.sheet(0).header(17).rows().iterator(); iter.hasNext(); ) {
            Row row = iter.next();
            String v = row.getString(13);
            // 判断是否为DISPIMG
            if (StringUtil.isNotEmpty(v) && v.startsWith("=DISPIMG(\"")) {
                String key = v.substring(10, v.lastIndexOf('"'));
                // 通过ID匹配图片,拿到Path就是图片的内容
                System.out.println(key + ":" + cellImages.get(key));
            }
        }
    }
}

@RexH0
Copy link
Author

RexH0 commented Oct 30, 2023

完美拿到图片,感谢大佬🙏,期待这个功能出现在下一个release

@wangguanquan
Copy link
Owner

调查一下你们读取图片的场景,使用listPictures()获取所有图片之后怎么做?
我想表达的是通过listPictures()一次获取全部图片,还是通过Row#getPicture返回指定行列的图片,你觉得哪种比较实用更贴近实际开发?

另外可以将<嵌入图片测试.xlsx>示例文件使用wps另存为xls格式发我一份吗?要做成功能的话eec-e3-support也需要解析内嵌图片

@RexH0
Copy link
Author

RexH0 commented Oct 31, 2023

主要使用场景其实就是导入数据的时候同时把图片也导入到系统,
逐行获取图片可能会更好使用,
我使用listPictures()之后其实也是先把这个list上传到文件服务器并转换为Map<单元格坐标,fileUrl>,再去逐行循环 将fileUrl替换到rowCell中保存进数据库,所以在循环处理rowCell的时候能拿到图片更好,但是这种方式可能上传excel的性能会有一定影响,
xls 文件单独发你

@wangguanquan
Copy link
Owner

了解,你的使用场景使用listPictures()的确要好一些,可以异步批量上传图片,那后续内嵌图片依然通过listPictures()获取吧,功能统一还实用。

文件已收,感谢

wangguanquan pushed a commit that referenced this issue Nov 12, 2023
wangguanquan pushed a commit that referenced this issue Nov 12, 2023
@wangguanquan
Copy link
Owner

v0.5.12 支持xlsx下读取wps内嵌图片,使用现有方法listPictures即可。xls并不支持,目前没有找到wps关于cellImage的编码介绍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants