Copy paragraphs elements from one document to another. #182

Open
warmspringwinds opened this issue May 31, 2015 · 6 comments

Comments

@warmspringwinds

I need to reorder the paragraphs in my document.
I have sorted them, but now I can't write them back over the current document's paragraphs
or add them to a new document.

Basically, I have a list of paragraph objects in a different order, and I need to put them
into a document.

Is it possible?

@scanny
Contributor

scanny commented Jun 1, 2015

Accomplishing this in the general case is tricky. However, if your paragraphs are simple in content, it's possible to get it done with a simple approach.

There are two main approaches:

  1. Copy the content paragraph by paragraph, then write new paragraphs in the new document.
  2. Use deepcopy() to copy the lxml objects and insert them into the document at the lxml level.

With the first approach you'll need to go down to the run level to read and construct the new paragraphs; otherwise all text formatting will be lost.

@downtown12

@scanny I have run into a similar problem. I need to copy a document's entire content, including pictures, paragraphs, and page breaks, into another document. The second document needs to look the same as the first.

Could you explain the two approaches in more detail?

  1. For the first one, how can it deal with pictures?
  2. For the second one, how can I access the lxml objects of the source document?

Many thanks in advance.

@berezovskyi

@scanny, @downtown12, I have tried the following code and it works (preserving simple styles), but as it was stripping the list tags, I had to make an extremely ugly hack:

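import copy

def direct_insert(doc_dest, doc_src):
    for p in doc_src.paragraphs:
        # Insert a deep copy so the paragraph is not moved out of doc_src.
        inserted_p = doc_dest._body._body._insert_p(copy.deepcopy(p._p))
        # Read pPr directly so the source is not modified when it is absent.
        pPr = p._p.pPr
        if pPr is not None and pPr.numPr is not None:
            inserted_p.style = "ListNumber"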

[Image: screenshot from 2015-11-17 21:51:00]

Partly inspired by #217.

By the way, is cross-document editing a future priority?

@scanny
Contributor

scanny commented Nov 18, 2015

I'd say it's unlikely to appear soon unless someone decides to contribute it or sponsor it. It's a challenging feature set because a general case solution entails reconciling the styles across the two documents.

@berezovskyi

@scanny thanks for the amazing work you've already done!

@Thebee23

Thebee23 commented Sep 6, 2017

How can I extract the numbering of a multilevel list from a .docx file using Python?
