Unfold #8
Dear HeXu,
I am very excited by the unfolding code, thanks a lot!
The parser was not designed to handle big files, i.e. larger than 200 MB.
The main issue with such big files is not the speed of the parsing but the
available memory: Python and matplotlib are very memory demanding and often
require more memory than is available. The current parser was written for
stability, and the regular expressions are used to make the code a little
more resilient to corrupted data.
The filter module was written to deal with huge files. This class doesn't
use regular expressions (maybe a little, I don't remember) and avoids
loading the whole file at once (memory is my main limitation). Filtering
also makes the new PROCAR file conceptually clearer, for instance by
reducing the number of entries to surface+bulk or impurity+host, etc.
The best part is that you only need to filter the file once.
Of course we could change the design of the parser, or keep the original
version as a fallback.
Best regards,
Francisco
…On Wed, Feb 20, 2019 at 12:46 PM mailhexu ***@***.***> wrote:
Hi Uthpala,
The band unfolding is added. List of changes:
- procarunfold module added.
- scriptUnfold.py added.
- documentation (docx file, is it the right place?)
- I found the parser a bit slow, so I put my simple version in as
readFile2. From your comment, I see that there could be some problem with
the 1-atom case, but I don't know why. I think the extensive use of
re.findall() is responsible for the slowness; for large files it can take
minutes, or even hours, to parse.
Cheers,
HeXu
------------------------------
Commit Summary
- merge changes from Uthpala for coeff with phase
- add unfold module
- add ProcarUnfolder
- add scriptUnfold.py and add supercell_matrix option to kpath
- update Manual
- fix typo
- clean
- typo
File Changes
- *M* .gitignore <https://github.com/romerogroup/pyprocar/pull/8/files#diff-0> (120)
- *M* changelog.txt <https://github.com/romerogroup/pyprocar/pull/8/files#diff-1> (2)
- *M* docs/PyProcar_Manual.docx <https://github.com/romerogroup/pyprocar/pull/8/files#diff-2> (0)
- *M* pyprocar/__init__.py <https://github.com/romerogroup/pyprocar/pull/8/files#diff-3> (2)
- *M* pyprocar/procarparser/procarparser.py <https://github.com/romerogroup/pyprocar/pull/8/files#diff-4> (903)
- *M* pyprocar/procarselect/procarselect.py <https://github.com/romerogroup/pyprocar/pull/8/files#diff-5> (9)
- *A* pyprocar/procarunfold/__init__.py <https://github.com/romerogroup/pyprocar/pull/8/files#diff-6> (1)
- *A* pyprocar/procarunfold/fatband.py <https://github.com/romerogroup/pyprocar/pull/8/files#diff-7> (133)
- *A* pyprocar/procarunfold/procar_unfolder.py <https://github.com/romerogroup/pyprocar/pull/8/files#diff-8> (85)
- *A* pyprocar/procarunfold/procarparser.py <https://github.com/romerogroup/pyprocar/pull/8/files#diff-9> (488)
- *A* pyprocar/procarunfold/unfolded_band.png <https://github.com/romerogroup/pyprocar/pull/8/files#diff-10> (0)
- *A* pyprocar/procarunfold/unfolder.py <https://github.com/romerogroup/pyprocar/pull/8/files#diff-11> (125)
- *M* pyprocar/scriptKpath.py <https://github.com/romerogroup/pyprocar/pull/8/files#diff-12> (6)
- *A* pyprocar/scriptUnfold.py <https://github.com/romerogroup/pyprocar/pull/8/files#diff-13> (78)
- *M* pyprocar/utilsprocar/utilsprocar.py <https://github.com/romerogroup/pyprocar/pull/8/files#diff-14> (452)
Patch Links:
- https://github.com/romerogroup/pyprocar/pull/8.patch
- https://github.com/romerogroup/pyprocar/pull/8.diff
--
Francisco Munoz
Assistant Professor
Faculty of Sciences,
Department of Physics
University of Chile
+562 29787414
|
Merged He Xu's commits to complexparser branch. |
@fvmunoz |
Hi HeXu and Uthpala,
sorry for reviving an old email, but yesterday I faced *extremely* slow
parsing of a PROCAR file with phase factors (LORBIT=12). It is far too
slow to be usable, and it has nothing to do with the "filter" I told you
about earlier.
A very easy fix is to insert a couple of lines between lines 308-309 of
procarparser/procarparser.py:
(308)  else:
(new)      # get rid of phase factors
(new)      self.spd = re.findall(r"ion.+tot\n([-.\d\seto]+)", self.fileStr)
(new)      self.spd = ''.join(self.spd)
(309)      self.spd = re.findall(r"([-.\d\se]+tot.+)\n", self.spd)
I tested it in every case I could think of; it works fine and is several
times faster than the previous version. Uthpala, could you make the changes
on GitHub, please?
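To make the effect of the two regular expressions above easier to follow, here is a minimal standalone sketch. The sample text is made up and heavily truncated (2 ions, columns s/p/d/tot), the layout of the phase block is an assumption for illustration, and the function name strip_phase_blocks is hypothetical; it is not part of pyprocar.

```python
import re

# Made-up, truncated PROCAR-like text. The second "ion" header (without
# "tot") stands in for the phase-factor block written with LORBIT=12.
FILE_STR = (
    "band   1 # energy  -1.000 # occ.  1.000\n"
    "\n"
    "ion      s      p      d    tot\n"
    "  1  0.100  0.200  0.000  0.300\n"
    "  2  0.050  0.150  0.000  0.200\n"
    "tot  0.150  0.350  0.000  0.500\n"
    "ion      s      p      d\n"
    "  1  0.010 -0.020  0.000\n"
    "  1  0.030  0.040  0.000\n"
    "  2 -0.050  0.060  0.000\n"
    "  2  0.070 -0.080  0.000\n"
    "\n"
    "band   2 # energy  -0.500 # occ.  1.000\n"
    "\n"
    "ion      s      p      d    tot\n"
    "  1  0.300  0.100  0.000  0.400\n"
    "  2  0.200  0.100  0.000  0.300\n"
    "tot  0.500  0.200  0.000  0.700\n"
    "ion      s      p      d\n"
    "  1  0.011 -0.021  0.000\n"
    "  1  0.031  0.041  0.000\n"
    "  2 -0.051  0.061  0.000\n"
    "  2  0.071 -0.081  0.000\n"
)


def strip_phase_blocks(file_str):
    """Hypothetical helper mirroring the two-step fix quoted above."""
    # Step 1: keep only what follows a header line ending in "tot".
    # The character class stops at the "i" of the next "ion" header,
    # so the phase-factor rows are never captured.
    blocks = re.findall(r"ion.+tot\n([-.\d\seto]+)", file_str)
    joined = "".join(blocks)
    # Step 2: split the much smaller string into one entry per band,
    # each ending with its "tot" row.
    rows = re.findall(r"([-.\d\se]+tot.+)\n", joined)
    return joined, rows


joined, rows = strip_phase_blocks(FILE_STR)
for row in rows:
    print(row.splitlines()[-1])
```

The second pattern only has to scan the projection rows, not the phase rows, which is consistent with the speed-up reported in the message; the sketch itself makes no timing claim.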
…On Wed, Mar 20, 2019 at 10:51 AM mailhexu ***@***.***> wrote:
@fvmunoz <https://github.com/fvmunoz>
Dear Francisco,
Thanks for your comments, and sorry for replying so late. I only now found
the time to learn how to use the filter. The unfolding is done point by
point (a point means one (ikpt, iband, ispin) triple). If the filter is
applied to (ikpt, iband, ispin), it gives exactly one line in PROCAR, since
each point happens to be written on one line of the PROCAR. Thus, for the
unfolding, it is equivalent to reading the file line by line. If we use an
iterator instead of a list, the memory needed is negligible.
Best regards,
HeXu
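The iterator idea in the message above can be sketched as a generator that walks a PROCAR-like file line by line and yields one point at a time. This is only an illustration under assumptions: the function name iter_band_points is hypothetical, the header layout ("k-point N :" and "band N # energy E # occ. O") is assumed, and spin is ignored for brevity. It is not the unfolding code from this pull request.

```python
import io
from typing import Iterable, Iterator, Tuple


def iter_band_points(lines: Iterable[str]) -> Iterator[Tuple[int, int, float]]:
    """Yield (ikpt, iband, energy) lazily; only one line is held at a time."""
    ikpt = 0
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        if tokens[0] == "k-point":
            # e.g. " k-point    1 :    0.000 0.000 0.000     weight = 0.500"
            ikpt = int(tokens[1])
        elif tokens[0] == "band":
            # e.g. "band   1 # energy  -1.000 # occ.  1.000"
            yield ikpt, int(tokens[1]), float(tokens[4])


# Made-up sample; a real file object opened with open(path) works the same
# way, because file objects are themselves iterators over lines.
sample = io.StringIO(
    " k-point    1 :    0.000 0.000 0.000     weight = 0.500\n"
    "\n"
    "band   1 # energy  -1.000 # occ.  1.000\n"
    "band   2 # energy   0.250 # occ.  0.000\n"
    " k-point    2 :    0.500 0.000 0.000     weight = 0.500\n"
    "band   1 # energy  -0.750 # occ.  1.000\n"
)

# list() is used here only because the sample is tiny; in real use one
# would loop over the generator directly to keep memory use flat.
points = list(iter_band_points(sample))
print(points)
```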
|
Thanks for bringing this up. I will include this fix.
Best,
Uthpala
--
Uthpala Herath
Graduate Research Assistant
Department of Physics and Astronomy
West Virginia University
Morgantown, WV 26505
Tel. (304) 216-2535
Email: ukh0001@mix.wvu.edu <ukherathmudiyanselag@mix.wvu.edu>
herathuk@gmail.com
|