copy image as duplicates #704

Closed
shimizukawa opened this issue Jan 2, 2015 · 10 comments
@shimizukawa
Member

Hello all,

I am making a website with plenty of figures, and I am very annoyed every time I run make html (or sphinx-build -b html -d _build/doctrees . _build/html MY_RST_FILES).

My figures are in a completely different folder (i.e. outside the website folder or tree).

When I run make html (or sphinx-build -b html -d _build/doctrees . _build/html MY_RST_FILES) and the images have never been copied to _images before, they are copied there and everything is fine.

But when I run make html (or sphinx-build -b html -d _build/doctrees . _build/html MY_RST_FILES) and the images have already been copied to _images, they are copied again and renamed to fig_nameN.ext, where N is an integer starting from 1 that depends on how many duplicates are already there.

This is particularly annoying because of the wasted time and storage.

In the module Sphinx-1.0.7-py2.5.egg/sphinx/util/osutil.py, I created a new function no_duplicate_copyfile:

#!python

import os

def no_duplicate_copyfile(source, dest):
    """Copy a file and its modification times, but only if dest is missing or older."""
    # copyfile is the existing helper defined in this module (osutil.py).
    if not os.path.exists(dest):
        # Destination does not exist yet: copy.
        copyfile(source, dest)
        return
    # Destination exists: compare modification times at whole-second resolution.
    source_mtime = int(os.stat(source).st_mtime)
    dest_mtime = int(os.stat(dest).st_mtime)
    # Copy only when the source is newer than the destination.
    if dest_mtime < source_mtime:
        copyfile(source, dest)

My problem is solved and I hope that this solution may help someone else. If another solution already existed, I'd be interested in knowing.

Thanks
Christophe.


@shimizukawa
Member Author

From Anonymous on 2011-05-31 01:57:22+00:00

+1

Also:

  • If you manually delete all the images in the _build/html/_images folder (instead of doing a 'make clean') and then re-compile, it still renumbers! So if you have for example Image.png to Image35.png in there, and delete them all... the build still generates Image36.png instead of just Image.png -- so somewhere it's keeping track of what's there.
  • If you have images in subfolders in your source, and some images have the same name, the build ignores the subfolders and instead renames them by adding a number. So if you have images/subfolder1/Image.png and images/subfolder2/Image.png in your source, the build will generate _build/html/_images/Image.png and _build/html/_images/Image1.png
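The second observation is what you would expect if every image is flattened into a single output directory and name clashes are resolved with a counter. A minimal sketch of that idea follows; flatten_image_names is a hypothetical helper written for illustration, not Sphinx's actual code.

```python
import posixpath


def flatten_image_names(source_paths):
    """Map each source path to a unique file name in one flat directory."""
    assigned = {}
    used = set()
    for path in source_paths:
        base, ext = posixpath.splitext(posixpath.basename(path))
        name = base + ext
        counter = 0
        # On a clash, append 1, 2, 3, ... to the base name until it is free.
        while name in used:
            counter += 1
            name = "%s%d%s" % (base, counter, ext)
        used.add(name)
        assigned[path] = name
    return assigned
```

With the two paths from the example, the first keeps the name Image.png and the second becomes Image1.png, because the subfolder is dropped before the names are compared.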

@shimizukawa
Member Author

From Anonymous on 2011-07-07 12:50:17+00:00

I have the same issue. I'm including one image, but it's creating 32(!!!) duplicates of it, and linking to the latest one in the html. Going to give your function a try, thanks.

@shimizukawa
Member Author

From Vadi on 2011-07-07 13:02:02+00:00

The original fix didn't work for me though. I added the new method, changed the import in __init__ and the call to it, yet it still duplicates files.

@shimizukawa
Member Author

From adamgreenhall on 2011-08-09 19:22:34+00:00

Same problem as Vadi. Christophe, what else did you do besides adding the no_duplicate_copyfile function?

@shimizukawa
Member Author

From Alain Spineux on 2011-09-15 10:58:18+00:00

I suffer from the same problem.

Here is what I do:
I put my images in the _static directory referenced by html_static_path

and I have modified post_process_images() in builders/__init__.py so that it does not handle images that are already in the _static directory.

{{{
#!python

--- builders/__init__.py.orig 2011-09-15 12:35:50.502528977 +0200
+++ builders/__init__.py 2011-09-15 12:35:29.319501094 +0200
@@ -148,6 +148,13 @@
                     continue
                 node['uri'] = candidate
             else:
+                # ASX don't handle image that are already in self.config.html_static_path
+                is_static=False
+                for path in self.config.html_static_path:
+                    is_static=node['candidates']['*'].startswith(path)
+                if is_static:
+                    continue
+
                 candidate = node['uri']
             if candidate not in self.env.images:
                 # non-existing URI; let it alone

}}}
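One thing to watch in the patch above: is_static is reassigned on every loop iteration, so only the last entry of html_static_path decides the result. A minimal sketch of a check that considers every entry follows. The helper name is_under_static_path and its arguments are hypothetical, not part of Sphinx, and the sketch assumes POSIX-style relative paths.

```python
import posixpath


def is_under_static_path(image_uri, static_paths):
    """Return True if image_uri lies inside any directory in static_paths."""
    uri = posixpath.normpath(image_uri)
    for path in static_paths:
        prefix = posixpath.normpath(path)
        # Match whole path components, so "_static2/x.png" is not
        # treated as being inside "_static".
        if uri == prefix or uri.startswith(prefix + "/"):
            return True
    return False
```

Returning as soon as one entry matches keeps the result independent of the order of the entries in the list.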

This is only part of the solution; a complete fix requires more work:

  • compare file content
  • smart name mangling

Alain Spineux
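The two remaining items listed above (comparing file content and smarter name mangling) could be combined roughly as follows. This is only a sketch under stated assumptions: copy_image_once is a hypothetical helper, the content-hash suffix is one possible mangling scheme, and none of this is taken from Sphinx itself.

```python
import filecmp
import hashlib
import os
import shutil


def copy_image_once(source, dest_dir):
    """Copy source into dest_dir, reusing an identical existing file.

    Returns the file name used inside dest_dir.
    """
    name = os.path.basename(source)
    dest = os.path.join(dest_dir, name)
    if os.path.exists(dest) and not filecmp.cmp(source, dest, shallow=False):
        # Same name but different content: derive a stable name from the
        # content, so repeated builds map the file to the same name.
        with open(source, "rb") as f:
            digest = hashlib.sha1(f.read()).hexdigest()[:8]
        base, ext = os.path.splitext(name)
        name = "%s-%s%s" % (base, digest, ext)
        dest = os.path.join(dest_dir, name)
    if not os.path.exists(dest):
        shutil.copyfile(source, dest)
    return name
```

Because the mangled name depends only on the file content, running the copy again does not create Image1.png, Image2.png and so on; an unchanged image always lands on the name it already has.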

@shimizukawa
Member Author

From Georg Brandl on 2011-09-22 10:02:49+00:00

Fix #704: image file duplication bug.

→ <<cset 020daea>>

@shimizukawa
Member Author

From Georg Brandl on 2011-09-22 10:14:35+00:00

Removing milestone: 1.0 (automated comment)

@shimizukawa
Member Author

From Anonymous on 2012-07-22 06:00:47+00:00

Hi,

We are seeing the same problem on Ubuntu 12.04 LTS, python-sphinx package version 1.1.3+dfsg-2ubuntu2.1, resulting in a duplicate set of images for each of our 8 language translations: 45 MB of docs balloon to 440 MB!

files: http://trac.osgeo.org/osgeo/browser/livedvd/gisvm/trunk/doc

our ticket: https://trac.osgeo.org/osgeo/ticket/952

thanks,
Hamish / The OSGeo Live DVD project
http://live.osgeo.org

@shimizukawa
Member Author

From Angelos Tzotsos on 2014-06-10 08:44:27+00:00

Hi,

We are still having the same problem in Ubuntu 14.04 and Sphinx 1.2.2 as Hamish mentioned above.

Thanks,
Angelos / The OSGeo Live DVD project http://live.osgeo.org

@shimizukawa
Member Author

From Takayuki Shimizukawa on 2014-06-11 11:17:31+00:00

Hamish, Angelos Tzotsos, I can't reproduce the behavior this issue mentioned.

If you mean that sphinx should share the image directory for each build output, I think it's a new proposal.

Please make another ticket. Or, your contribution is always welcome :)

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 30, 2021