docs: initial POC of Jessica's fabric doc generator #2023
Hey @mhamilton723 👋! We use semantic commit messages to streamline the release process. Examples of commit messages with semantic prefixes:
To test your commit locally, please follow our guide on building from source.
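As a rough illustration of what a semantic-prefix check might look like, here is a minimal sketch. The prefix list and the `has_semantic_prefix` helper are assumptions based on common Conventional Commits usage; the list this project actually accepts may differ.

```python
import re

# Assumed prefix set (common Conventional Commits types); the project's
# real accepted list is not shown in this thread and may differ.
SEMANTIC_PREFIXES = (
    "build", "chore", "docs", "feat", "fix",
    "perf", "refactor", "style", "test",
)

# "<prefix>: <text>" with an optional "(scope)" after the prefix.
_PATTERN = re.compile(r"^(%s)(\([\w-]+\))?: \S" % "|".join(SEMANTIC_PREFIXES))


def has_semantic_prefix(message: str) -> bool:
    """Return True if the first line of the message starts with a known prefix."""
    lines = message.splitlines()
    return bool(lines) and _PATTERN.match(lines[0]) is not None
```

With this sketch, the PR title "docs: initial POC of Jessica's fabric doc generator" would pass, while a bare "update fabric channel" would not.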
update fabric channel
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Codecov Report
@@            Coverage Diff             @@
##           master    #2023      +/-   ##
==========================================
- Coverage   87.07%   85.05%   -2.03%
==========================================
  Files         306      306
  Lines       16063    16063
  Branches      852      852
==========================================
- Hits        13987    13662     -325
- Misses       2076     2401     +325
…23/SynapseML into rebase-fabric-channel
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
tools/docgen/docgen/channels.py
Outdated
html = markdown.markdown(md, extensions=["markdown.extensions.tables", "markdown.extensions.fenced_code"])
parsed_html = BeautifulSoup(html)
# Download images and place them in media directory while updating their links
parsed_html = self._download_and_replace_images(parsed_html, resources, output_img_dir, os.path.dirname(output_file), None, False)
Looks like this is in both branches of the if statement.
They are slightly different. The only line that's the same is parsed_html = BeautifulSoup(html).
Is this OK?
The extra formatting steps seem like they would be no-ops on the first branch. We also want to remove useless style and cell output metadata in both branches. In that case, they can be safely combined, right?
Sure. Extra style and output metadata are not likely to be in an rst file, but it won't hurt to combine.
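To make the agreed refactor concrete, here is a minimal sketch of hoisting the shared cleanup out of the branches. Every function name below is a hypothetical stand-in, not the actual code in channels.py, and the regex-based cleanup only illustrates the idea (the PR itself works on a BeautifulSoup tree).

```python
import re


def convert_notebook(source: str) -> str:
    # Hypothetical stand-in for the notebook-to-HTML conversion.
    return f"<div class='nb'>{source}</div>"


def convert_rst(source: str) -> str:
    # Hypothetical stand-in for the rst-to-HTML conversion.
    return f"<div class='rst'>{source}</div>"


def remove_style_blocks(html: str) -> str:
    # A no-op when the input has no <style> elements, which is the
    # expected case for rst input according to the discussion above.
    return re.sub(r"<style.*?</style>", "", html, flags=re.DOTALL)


def process(source: str, is_notebook: bool) -> str:
    # Only the format-specific conversion stays inside the branch.
    html = convert_notebook(source) if is_notebook else convert_rst(source)
    # Shared cleanup runs once, after the branch, for both formats.
    return remove_style_blocks(html)
```

Because the cleanup does nothing when there is nothing to remove, running it on both paths changes the rst output only if stray style blocks are actually present.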
tools/docgen/docgen/channels.py
Outdated
# Download images and place them in media directory while updating their links
parsed_html = self._download_and_replace_images(parsed_html, resources, output_img_dir, os.path.dirname(output_file), None, False)
# Remove StatementMeta
for element in parsed_html.find_all(text=re.compile("StatementMeta\(.*?Available\)")):
These two can be put in both branches of the if statement, right? Also, the StatementMeta check should be pushed upstream to the actual notebooks when possible, because we don't want them there either.
This is for the Sempy docs. We do not have that in our notebooks, but they have that line in some of their examples. I just want to put it there to get cleaner output for their samples.
Yes, makes sense, but this would be a no-op for us (and would be helpful if our docs had those by mistake).
Adding a warning... do you want it to be an error?
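For reference, the warning-versus-error choice could be sketched as below. This is only an illustration on plain strings: the `strip_statement_meta` helper and its `strict` flag are hypothetical, and only the regular expression is taken from the snippet under review (written here as a raw string).

```python
import re
import warnings

# Same pattern as in the reviewed snippet, as a raw string.
STATEMENT_META = re.compile(r"StatementMeta\(.*?Available\)")


def strip_statement_meta(text: str, strict: bool = False) -> str:
    """Remove StatementMeta(...) residue; warn by default, raise if strict."""
    matches = STATEMENT_META.findall(text)
    if not matches:
        return text  # nothing to clean, so this is a no-op
    message = f"Found {len(matches)} StatementMeta occurrence(s) in cell output"
    if strict:
        raise ValueError(message)
    warnings.warn(message)
    return STATEMENT_META.sub("", text)
```

A warning keeps the doc build going while still flagging notebooks that should be cleaned upstream; switching `strict` on would turn the same check into a hard failure.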
Summary by GPT-4
The changes in this commit include:
- Adding new dependencies to environment.yml for mistletoe, pypandoc, markdownify, and traitlets.
- Creating a new README.md file for the doc generating pipeline onboarding for the Fabric channel.
- Updating the channels.py file to include a new class called FabricChannel with various methods for processing input files, downloading and replacing images, validating metadata, generating metadata headers, reading RST files, converting to markdown links, and more.
- Modifying the core.py file to update the process method with an index parameter.
- Updating the manifest.yaml file with new metadata for various notebooks related to FabricChannel.
- Updating the setup.py file to include new dependencies like pypandoc, markdownify, and traitlets.
These changes are mainly focused on adding support for a new Fabric channel in the doc generating pipeline and updating related files with necessary modifications and dependencies.
Suggestions
No suggestions are needed as the changes in this PR seem to be well implemented and organized.
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
…23/SynapseML into rebase-fabric-channel
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
* docs: initial POC of Jessica's fabric doc generator
* update fabric channel
* update fabric channel - rst file
* update fabric channel
* update fabric channel
* add readme, resolve conflict
* add install requires
* update fabric channel
* format channel
* add back WebsiteChannel
* formatting docgen
* Update tools/docgen/docgen/core.py
* Update tools/docgen/docgen/core.py
* fix index issue
* raise warning for if statementmeta in notebookcell output
---------
Co-authored-by: Jessica Wang <jessiwang@microsoft.com>
Co-authored-by: JessicaXYWang <108437381+JessicaXYWang@users.noreply.github.com>
Related Issues/PRs
#xxx
What changes are proposed in this pull request?
Briefly describe the changes included in this Pull Request.
How is this patch tested?
Does this PR change any dependencies?
Does this PR add a new feature? If so, have you added samples on website?
- website/docs/documentation folder. Make sure you choose the correct class estimators/transformers and namespace.
- DocTable points to correct API link.
- yarn run start to make sure the website renders correctly.
- <!--pytest-codeblocks:cont--> before each python code blocks to enable auto-tests for python samples.
- WebsiteSamplesTests job pass in the pipeline.