Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support multi-line notes in logbook entry #4

Closed
wants to merge 11 commits into from
Closed

support multi-line notes in logbook entry #4

wants to merge 11 commits into from

Conversation

jethrokuan
Copy link
Contributor

@jethrokuan jethrokuan commented Dec 4, 2017

This PR sets the stage for parsing of logbook entries. The goal is to enable logbook parsing, whie ensuring the existing tests pass.

@jethrokuan
Copy link
Contributor Author

As of now, the bulk of it works, but 2 tests are failing, and I'm stumped. @nevenz could you pull it down and take a look? The bulk of it are indentation issues, which I can't seem to resolve:

testRich1

org.junit.ComparisonFailure: expected:<...Wed +1d>
  :LOGBOOK:[  
  - Note taken on [2015-02-07 Sat 20:58] \\
    This is a multi
    line note.
  - State "DONE"       from "TODO"       [2015-02-07 Sat 20:47]
  CLOCK: [2015-02-07 Sat 20:47]--[2015-02-07 Sat 20:47] =>  0:00
  - State "DONE"       from "TODO"       [2015-02-07 Sat 20:47]
  :END:      ]
  - Note taken on [...> but was:<...Wed +1d>
  :LOGBOOK:[
  - Note taken on [2015-02-07 Sat 20:58] \\
    This is a multi
    line note.
  - State "DONE"       from "TODO"       [2015-02-07 Sat 20:47]
  CLOCK: [2015-02-07 Sat 20:47]--[2015-02-07 Sat 20:47] =>  0:00
  - State "DONE"       from "TODO"       [2015-02-07 Sat 20:47]
  :END:]
testParserHook

org.junit.ComparisonFailure: expected:<...ng

** Another note
[:LOGBOOK:
- Blah
]:END:

**** Grandchi...> but was:<...ng

** Another note
[   :LOGBOOK:
   - Blah
   ]:END:

FWIW I think the latter has to do with some special handling for org-indent-mode which wasn't carried over to logbooks

@nevenz
Copy link
Member

nevenz commented Dec 4, 2017

These failures look good to me. As in - code wins and tests need to be updated.

testParserHook failed because LOGBOOK is not part of the content anymore, so it needs to be indented, which is what (properly) happened.

testRich1 failed because of some trailing spaces after :LOGBOOK: and :END: in the original file. Same as the above - previously part of the content, so they couldn't be modified. But now we don't care, so things get cleaned up.

@@ -42,4 +42,7 @@
public static final Pattern HEAD_TAGS_P = Pattern.compile("^(.*)\\s+:(\\S+):\\s*$");

public static final Pattern PROPERTY = Pattern.compile("^:([^:\\s]+):\\s+(.*)\\s*$");

public static final Pattern LOGBOOK_NOTE_P = Pattern.compile("^- Note taken on " + DT);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nevenz I don't like this regexp: in particular it doesn't match the full line. I couldn't seem to add anything after DT, that would cause the regexp to fail matching everything. Any ideas?

@jethrokuan
Copy link
Contributor Author

Cleaned up the tests so that they pass, I think my next PR will be setting up travis to run the tests.

@jethrokuan jethrokuan changed the title [WIP] support multi-line notes in logbook entry support multi-line notes in logbook entry Dec 6, 2017
@jethrokuan
Copy link
Contributor Author

@nevenz i've fixed the bad regex, and added some unit tests, I think this is good to go now.

@nevenz
Copy link
Member

nevenz commented Dec 7, 2017

We should try to handle some badly formatted drawers. I've made a mess of my files quite a few times either by entering random command by accident or misplacing characters.

For example - missing :END::

* Note 1
:LOGBOOK:
- Note taken on [2017-12-07 Thu 11:46] \\
  next line starts with a star
* Note 2

These should be two notes, I think?

Or having :END:aaa instead of :END: etc.

Also, the other way around - when something should be part of the note, but it isn't:

* Note
:LOGBOOK:
- Note taken on [2017-12-07 Thu 11:46] \\
  next line stars with dash
  - Trying to confuse the parser
:END:

Perhaps user wanted a list inside a note (...inside the note). Remembering :LOGBOOK:'s column number and using that when searching for - could work?

There are probably more cases like this, we should try adding tests for anything crazy that comes to mind.

It can get tricky, but the top priority is that no data is lost. Worst case - drawer (or what we think it might be drawer, but couldn't quite get there) ends up being part of the content, like right now. So if we don't know how to handle something, we simply fall back to that.

BTW, can we do without mark() and reset()? This sounds like things might fail if logbook is too large?

@jethrokuan
Copy link
Contributor Author

For example - missing :END:

I think I handle this already, but I shall add test cases for these.

Or having :END:aaa instead of :END: etc.

I'm not sure we want to be so resilient with parsing on these malformed drawers. This can lead to a lot of complexity.

I'll handle the second case you brought up though.

It can get tricky, but the top priority is that no data is lost.

This should already be handled properly. Anything that fails to get picked up ends up as a generic string logbook entry. Anything that gets picked up has a 1-1 mapping between the object and the text representation.

BTW, can we do without mark() and reset()? This sounds like things might fail if logbook is too large?

I'm not too worried about this, because the read ahead is at max a line ahead. That said, this was the most elegant solution I can think of. I don't actually really know Java, so if you know a better way let me know.

@jethrokuan
Copy link
Contributor Author

@nevenz is this alright?

@nevenz
Copy link
Member

nevenz commented Dec 13, 2017

I'll see if I can come up with tests that are failing, if not I'll merge it.

One thing we'll need is a method for adding the latest log entry to OrgHead. They are stored in reverse order. I think that's what addLog should be, as it's what's visible to the user. BTW, I'd rename that to insertLogbookEntry to be more verbose.

Also, this is breaking app currently if used, so I'll remove the usage of local directory from Orgzly for now, so it doesn't confuse people.

@jethrokuan
Copy link
Contributor Author

One thing we'll need is a method for adding the latest log entry to OrgHead. They are stored in reverse order. I think that's what addLog should be, as it's what's visible to the user. BTW, I'd rename that to insertLogbookEntry to be more verbose.

Is addLog being used anywhere else atm? I was thinking that the parser would support parsing logbook entries, but on orgzly's side there would be no visible difference as of yet, not until everything else was ready.

Also, this is breaking app currently if used

How so? Are there integration tests in orgzly for this that I can look at?

@nevenz
Copy link
Member

nevenz commented Dec 13, 2017

Is addLog being used anywhere else atm? I was thinking that the parser would support parsing logbook entries, but on orgzly's side there would be no visible difference as of yet, not until everything else was ready.

No, addLog isn't used anywhere. I was thinking about the library interface in general. I think we want to make it complete from the start. Looking at OrgHead, now it's:

public void initLogbook();
public List<LogbookEntry> getLogbook();
public boolean hasLogbook();
public void addLog(LogbookEntry log);

As a user, I'd be wondering:

  1. What is initLogbook() and what do I do with it?
  2. Are entries that getLogbook() returns in chronological order?
  3. Does hasLogbook() return true even for an empty drawer?
  4. Does addLog() adds entry to the end or the beginning of the drawer?

Random thoughts/suggestions:

  1. I think this needs to be private.
  2. Not right now, they are added as they are parsed, which is reverse chronological order. This needs to be clear and probably changed.
  3. I think so now, right? initLogbook() is called when :LOGBOOK: is found. I think user cares more if drawer contain entries. If it doesn't, :LOGBOOK: doesn't even need to be created.
  4. Entries are added to the end of the drawer (in Org file context), which user will never use.

Anyway, let me know what you think.

How so? Are there integration tests in orgzly for this that I can look at?

Anything that's using a database and has notes with LOGBOOK - two tests in BookParsingTest are failing.

Orgzly keeps OrgHead in database, field by field. content is one of those. And this PR removes LOGBOOK from the content.

This is expected and fine, it's why I bumped org-java version to 1.2.

But Orgzly can't use the new version before it either starts storing logbook to a new field (or set of fields) in database or includes it in the content when saving OrgHead to database.

This is what I meant by:

After that, just storing complete clock entry in the content would be good enough for the first version (using improved OrgHead to parse, update and generate content).

Second pass would be keeping logbook entries and clock separately from the content, for easier/faster access, displaying them in the app, folding :LOGBOOK: etc.

in orgzly/orgzly-android#15.

It's probably easier to just prepend the content for now.

@jethrokuan
Copy link
Contributor Author

What is initLogbook() and what do I do with it?

Yes, I don't think this needs to be exposed.

Are entries that getLogbook() returns in chronological order?

We could probably use a linked list, with a pointer to the end? Or maybe just maintain consistency with how other drawers are currently being implemented.

I think as an internal representation having them in chronological order is alright.

Does hasLogbook() return true even for an empty drawer?

If the latter meaning is wanted (check if has entries), then the function should be renamed to hasLogbookEntries().

Does addLog() adds entry to the end or the beginning of the drawer?

Should be beginning, just like how org does it I believe.

It's probably easier to just prepend the content for now.

Yes, that's fine by me too, for now.

@jethrokuan
Copy link
Contributor Author

jethrokuan commented Jan 5, 2018

@nevenz i'm not sure about the latest changes I made. There are tests to be written, but I want to be sure that it's okay before I proceed.

  1. I made Logbook its own class. This results in some changes to the API, but is actually quite minimal.
  2. I'm using a linked list for logbook because efficient add to front is needed to build the list from file. this is different for all other drawers.

Because of this we need to rethink the API for logbooks in OrgHead.

AFAIK, the only thing we need in there is whether the logbook exists. However, this entails initLogbook() being public, and called when the parser finds a logbook drawer.

@nevenz
Copy link
Member

nevenz commented Jan 5, 2018

1 and 2 sounds good to me.

AFAIK, the only thing we need in there is whether the logbook exists. However, this entails initLogbook() being public, and called when the parser finds a logbook drawer.

I suggest adding "addLogbookEntry" to OrgHead and do a lazy init of the Logbook there.

For parser it doesn't matter that much, but user (app) shouldn't have to init + get + add.

@jethrokuan
Copy link
Contributor Author

jethrokuan commented Jan 9, 2018

@nevenz is this ok?

One thing to consider is how the internal representation -> text logic is not within the class itself. It's hard to move it in, considering it's not entirely dependent on the class itself, having additional properties like indentation.

@nevenz
Copy link
Member

nevenz commented Jan 9, 2018

One thing to consider is how the internal representation -> text logic is not within the class itself. It's hard to move it in, considering it's not entirely dependent on the class itself, having additional properties like indentation.

There could be toString for Logbook with no indentation. Indentation only matters when exporting a notebook to an Org file. And that's only done Note (and therefore OrgHead) at the time. So I think it's fine. Logbook#toString would be used to write logbook back to content in the app. Or store it in database (which might be actually easier to implement).

I'll check what needs to be done on Orgzly side for this to be used.

@jethrokuan
Copy link
Contributor Author

@nevenz any updates on how we could make this work?

@nevenz
Copy link
Member

nevenz commented Jan 15, 2018

@nevenz any updates on how we could make this work?

I'm updating the app to target Oreo, I'll get back to this as soon as that's done.

@jethrokuan
Copy link
Contributor Author

sure 😄 no hurry

@nevenz
Copy link
Member

nevenz commented Jan 20, 2018

I was looking to merge this, but I don't think this will work the way we planned. 😭

LOGBOOK can be in the middle of the content, which is perfectly valid (notes are added to it, state changes are recorded, etc.).

Which means content must be left alone. We can't extract LOGBOOK from it, because we won't know where to put it back. Users might not want it at the beginning of the content.

This is different from planning times or properties, which do have their own special place in the note.

But it's similar to plain timestamps (#7) for example. So we can do it the same way and have:

class OrgContent {
    private StringBuilder value;

    private List<OrgRange> timestamps;
    private boolean dirtyTimestamps;

    private Logbook logbook;
    private boolean dirtyLogbook;
}

Does that make sense?

@jethrokuan
Copy link
Contributor Author

The issue makes sense, and is indeed a blocker for merging. I fail to see how this helps us figure out how to convert this OrgContent back to text, maintaining the order of content.

I think I'll wait for #7 to settle, before making further changes.

@keithcrone
Copy link

I'm just thinking out loud when I ask...
What was the goal once they were parsed? Were you going to start logging state changes? Or clocking?

@jethrokuan
Copy link
Contributor Author

@keithcrone The end goal was clocking, which involves logging state changes.

@keithcrone
Copy link

keithcrone commented Jan 21, 2018 via email

@jethrokuan
Copy link
Contributor Author

@keithcrone thanks for the input. I think having sibling content entries within OrgContent is ideal for the parser, kind of like how Logbook is a list of LogbookEntrys right now. I'm not sure how Orgzly works underneath atm, so can't really form an opinion about that.

This parser is primarily a per-line parser. We could do a one line look-ahead, as I'm doing with multi-line logbook entries right now, but that felt a little out-of-place. Do you have a better idea of how to achieve the same result cleanly?

@keithcrone
Copy link

What are you going to do with the LOGBOOK data that you need to parse all of it?
Are you planning to do more than simple clock in / clock out of a task or record notes and state changes? Wouldn:t you just insert the next line (or multi line entry) underneath :LOGBOOK:, as the next entry?

@jethrokuan
Copy link
Contributor Author

Parsing all of LOGBOOK is not an issue; there's a finite number of permutations, and anything else can fall back to just plain text.

My concerns are with how a logbook drawer can fit into the larger picture. There can be multiple logbook drawers under an org heading. Does a logbook drawer only belong to the content right above it? If so, should we encode this information in the internal representation? If I had a logbook drawer in the middle of a piece of content, when I perform an org-clock-in would a new drawer be created?

All these must be designed to be completely aligned to how Org works in Emacs.

@keithcrone
Copy link

Right, which I guess is more my point.
It seems like you're interested in this one specific case: CLOCKing into a LOGBOOK. If you're trying to do more like create clock tables or other reports or even keep a running total of time spent on task, that might be outside of the scope of Orgzly at the moment.
Multiple LOGBOOK drawers seems like a misconfiguration on the part of org-mode and also outside the scope of Orgzly.
If ypu're talking about multiple drawers in general, then you have to start checking for stuff like '#+PROPERTY: Log_into_drawer Clock' or even per heading properties, which seems outside the scope of Orgzly at the moment.
What if I have #+STARTUP: nologdone in one file and not another? You can't even create a global setting for logging state changes.

With all that said, it doesn't seem like you need to parse the entire LOGBOOK if you're only interested in inserting your CLOCK times into here:

:LOGBOOK:
                                        <-Here
CLOCK: [...
CLOCK: [...
:END:

Which is why I asked if you were going to do anything with all the entries in LOGBOOK. It seems like a lot of work and uncertainty to parse and rebuild the drawer when there are a dizzying number of ways this could go wrong.

Again, I'm just thinking out loud, so don't let me dissuade you if you have a plan in mind. I'm picturing my use, which is someone who doesn't use CLOCK but takes a lot of notes in multiple drawers that aren't named LOGBOOK.

@jethrokuan
Copy link
Contributor Author

Again, I'm just thinking out loud, so don't let me dissuade you if you have a plan in mind

I think this is a healthy level of discussion, I'm taking all your points into consideration.

I'm picturing my use, which is someone who doesn't use CLOCK but takes a lot of notes in multiple drawers that aren't named LOGBOOK.

I'd like to make sure my work is extensible to this use case as well.

If you're trying to do more like create clock tables or other reports or even keep a running total of time spent on task, that might be outside of the scope of Orgzly at the moment.

Yes, I agree, this would be outside the scope of my PR as well. All I'm trying to do is ensure that LOGBOOK has an appropriate internal representation, and can be converted accurately back to its textual representation.

Multiple LOGBOOK drawers seems like a misconfiguration on the part of org-mode and also outside the scope of Orgzly.

Is this true? It seems to me that even LOGBOOK drawers can be arbitrarily created, and some people may like to use drawers to log certain information about paragraphs above. Misconfiguration or not, we need to make sure we don't destroy their files in the process of using Orgzly.

If ypu're talking about multiple drawers in general, then you have to start checking for stuff like '#+PROPERTY: Log_into_drawer Clock' or even per heading properties, which seems outside the scope of Orgzly at the moment.

Hmm, this actually isn't too hard, we can add this in when the need arises I guess.

With all that said, it doesn't seem like you need to parse the entire LOGBOOK if you're only interested in inserting your CLOCK times into here

I hear you... but...

It seems like a lot of work and uncertainty to parse and rebuild the drawer when there are a dizzying number of ways this could go wrong.

Like I mentioned above, I don't think this would really be an issue. Everything should just work now, and support for new kinds of logbook entries should be simple, by subclassing LogbookEntry! The uncertainty I have is not with parsing the LOGBOOK drawer itself, but where LOGBOOK should be. From what I understand now, the hierarchy would look something like this:

OrgHead -> List<OrgContent>

Where OrgContent could be anything from a Paragraph (array of lines) or a Drawer.

@keithcrone
Copy link

I'm sorry for sounding repetitive, but you still aren't really answering my question.

When I'm using Orgzly and I need to take a note in one of my drawers, I use a text expand type app. I open the heading, edit the text, insert a blank line under the drawer name, and after a couple of keys, the text expands into "-Note was taken on ..." with full date and time and slashes at the end. Cursor positions automatically to the next line and I start banging away at my note. The point is: I don't need to know anything else about the rest of the content of the drawer. I'm in complete control of the information placement, in the drawer of my choice, with literally just a key press.

Unless you're planning on actually doing something with the information inside a drawer, you can create the simplest solution with just a few buttons in all the empty space on the note edit screen.
Clock in - insert timestamp
Clock out - insert timestamp
Clock out / mark DONE - change the note state and log the time information
You can even do stuff like Create LOGBOOK and Boom, instant LOGBOOK drawer wherever the user wants.
User hits Save and the entirety of the text is still in the same place, and treated like text like it always has been, no parsing or rebuilding necessary.

@jethrokuan
Copy link
Contributor Author

Sorry if I didn't make myself clear -- let me address your concerns directly.

There are 2 broad considerations:

  1. How to identify LOGBOOK drawer, and where to place it in the internal representation.
  2. How to manipulate/represent LOGBOOK drawer.

You are saying that there is no need to parse the logbook if my intentions were to simply add a new CLOCK entry into the LOGBOOK. You are absolutely right. It is correct to say I only need to identify where the drawer starts, and insert the clock entry at the top of the drawer. Hence, (2) is not necessary. In fact, prior to this PR, Logbook was represented as an array of lines, precisely how you expect it to be.

What I've been trying to get at is that (2) is already implemented in this PR. This all works out very well, and would've been merged in.

What is blocking this PR is (1), with the issues that @nevenz has raised.

@keithcrone
Copy link

Sounds like the time for healthy discussion may have passed. Good luck, I hope you figure it out.

@jethrokuan
Copy link
Contributor Author

@keithcrone Thanks, sorry if i offended you in any way, that was never my intention.

@nevenz
Copy link
Member

nevenz commented Jan 24, 2018

I think we just keep the content as it is, no extracting anything from it and no rebuilding.

@keithcrone's idea about breaking content to separate pieces by type is interesting and could be used as kind of optimization inside OrgContent perhaps. But it would be too complicated to store those separately in DB by Orgzly.

Clocking in/out, adding state changes etc. are all methods that can be in OrgHead and would operate on OrgContent.

OrgContent could have Logbook, or just offsets pointing to it. But that would be just optimization too, and those could be discarded anytime.

Folding drawer is another thing too keep in mind. But that's mostly Orgzly's job. It will need to be able to get the location of the drawer from this library.

This PR could be saved by removing most code from parser and writer I guess. And modifying methods for adding entries to deal with OrgContent (assuming you're fine with the above, different suggestions welcome of course).

@jethrokuan
Copy link
Contributor Author

Sorry for the inactivity, I'm afraid I have to push this back quite a bit, I don't have the time to work on this at the moment.

@nevenz
Copy link
Member

nevenz commented Feb 7, 2018

Sorry for the inactivity, I'm afraid I have to push this back quite a bit, I don't have the time to work on this at the moment.

That's fine, thanks for all the work so far. There's a lot of good points in this discussion too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants