Add support for line breaks in jupyter #27834

Merged
merged 5 commits into from Apr 23, 2019
Changes from 2 commits
41 changes: 36 additions & 5 deletions tensorflow/tools/compatibility/ipynb.py
@@ -12,7 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# =============================================================================
-"""A module to support operation on ipynb files"""
+"""A module to support operations on ipynb files"""

from __future__ import absolute_import
from __future__ import division
@@ -62,8 +62,37 @@ def process_file(in_filename, out_filename, upgrader):
return files_processed, report_text, errors


+def skip_magic(code_line, magic_list):
+  """
+  Checks if the line starts with a magic prefix, i.e. is not Python code.
+
+  >>> skip_magic('!ls -laF', ['%', '!', '?'])
+  True
+  """
+
+  for magic in magic_list:
+    if code_line.startswith(magic):
+      return True
+
+  return False
+
+
+def check_line_split(code_line):
+  r"""
+  Checks if a line was split with `\`.
+
+  >>> check_line_split("!gcloud ml-engine models create ${MODEL} \\\n")
+  True
+  """
+
+  if code_line.endswith('\\\n'):
+    return True
+
+  return False
+
+
def _get_code(input_file):
-"""Load the ipynb file and return a list of CodeLines."""
+"""Loads the ipynb file and returns a list of CodeLines."""

raw_code = []

@@ -75,15 +104,17 @@ def _get_code(input_file):
if is_python(cell):
cell_lines = cell["source"]

+is_line_split = False
for line_idx, code_line in enumerate(cell_lines):

# Sometimes, jupyter has more than python code
# Idea is to comment these lines, for upgrade time
-if code_line.startswith("%") or code_line.startswith("!") \
-or code_line.startswith("?"):
+if skip_magic(code_line, ['%', '!', '?']) or is_line_split:
# Found a special character, need to "encode"
code_line = "###!!!" + code_line
Member

I don't understand how this works...

Contributor Author

The idea is just to make it work for cases like

!gcloud ml-engine models create ${MODEL} \
        --regions us-central1

So I just check whether the "magic" code line ends with `\`; if it does, I also need to comment out the next line, since it's not Python 🙂
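
The check described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the diff: `mark_non_python` is a hypothetical helper, and here the continuation flag is updated only for lines that were themselves marked, so ordinary Python lines that end in a backslash are left alone.

```python
def mark_non_python(cell_lines, magics=("%", "!", "?"), prefix="###!!!"):
    """Prefix magic lines and the lines that continue them via a trailing backslash.

    Hypothetical helper for illustration; not part of ipynb.py.
    """
    marked = []
    in_continuation = False  # True when the previous marked line ended with "\"
    for line in cell_lines:
        if line.startswith(magics) or in_continuation:
            # Decide whether the *next* line still belongs to this command.
            in_continuation = line.endswith("\\\n")
            line = prefix + line
        marked.append(line)
    return marked


cell = [
    "!gcloud ml-engine models create ${MODEL} \\\n",
    "        --regions us-central1\n",
    "print('done')\n",
]
marked = mark_non_python(cell)
```

With this input, the first two lines receive the prefix and the `print` line is left unchanged.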

Contributor Author

I also prepared a couple of test cases that I am planning to add somewhere, so it's testable -> https://colab.research.google.com/drive/1fCcmg8ZcbnR5dnG3KNl96QKN7u0FrQL-#scrollTo=jrxeKRCwogn6
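
For illustration, unit tests for the two helpers might look like the sketch below. It does not reproduce the linked Colab: the helpers are reimplemented compactly (same behavior as in the diff) so the block runs on its own, and the test class name `IpynbHelpersTest` is hypothetical.

```python
import unittest


def skip_magic(code_line, magic_list):
    """True if the line starts with any of the magic prefixes."""
    return any(code_line.startswith(magic) for magic in magic_list)


def check_line_split(code_line):
    """True if the line ends with a backslash followed by a newline."""
    return code_line.endswith("\\\n")


class IpynbHelpersTest(unittest.TestCase):  # hypothetical test class

    def test_skip_magic(self):
        magics = ["%", "!", "?"]
        self.assertTrue(skip_magic("!ls -laF", magics))
        self.assertFalse(skip_magic("import tensorflow as tf", magics))

    def test_check_line_split(self):
        self.assertTrue(
            check_line_split("!gcloud ml-engine models create ${MODEL} \\\n"))
        self.assertFalse(check_line_split("        --regions us-central1\n"))


# Run the suite programmatically so the result object can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(IpynbHelpersTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```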

Member

Yes, would be good to have test cases for this stuff with the code here.

I see the logic to skip the next line, but what does the ###!!! do?

Contributor Author

It's just a way to encode these sections with a unique comment prefix, so I can strip it again after pasta has run.

`###!!!` is just a "unique pattern" that is unlikely to be used by anybody else. Otherwise I might remove something that I did not want to remove.
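
A rough sketch of that round trip, assuming hypothetical `encode` and `decode` helpers (the stripping code itself is not shown in this diff): the marker is removed only when it appears at the very start of a line, so an ordinary `#` comment survives untouched.

```python
PREFIX = "###!!!"  # same marker string as in the diff


def encode(line):
    """Turn a non-Python line into a comment the upgrader will leave alone."""
    return PREFIX + line


def decode(line):
    """Strip the marker, but only if the line starts with it."""
    return line[len(PREFIX):] if line.startswith(PREFIX) else line


original = ["%matplotlib inline\n", "# a normal comment\n", "x = 1\n"]

# Encode only the magic lines, as the notebook loader would.
encoded = [encode(l) if l.startswith(("%", "!", "?")) else l for l in original]

# ... the upgrade step would run on `encoded` here ...

restored = [decode(l) for l in encoded]
```

A plain `#` prefix would not be safe here, since stripping it afterwards would also uncomment comments that were in the notebook to begin with.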


+is_line_split = check_line_split(code_line)

# Sometimes, people leave \n at the end of cell
# in order to migrate only related things, and make the diff
# the smallest -> here is another hack
@@ -102,7 +133,7 @@ def _get_code(input_file):


def _update_notebook(original_notebook, original_raw_lines, updated_code_lines):
-"""Update notebook, once migration is done."""
+"""Updates notebook, once migration is done."""

new_notebook = copy.deepcopy(original_notebook)
