Commit ecffaf7 (1 parent: dae3b1d). Showing 6 changed files with 881 additions and 3 deletions.
@@ -1,2 +1,219 @@
Methodology
===========
.. Internal references

.. _faker-file: https://github.com/barseghyanartur/faker-file/

But why
-------
Let's start with a hypothetical question.

"But why generate test files dynamically?", you may ask.

And the answer would be: "for a number of reasons".

You do need files, and managing test files is a pain nobody wants to
have. You create test files for one use case, then you need to support
another, which requires modifying the original files. You either duplicate
the files or change them in place. After a number of iterations, your
collection of test files grows so big that you can no longer easily tell how
the files differ from one another. Or your tests fail, and after some
investigation you find out that a slight modification of one of the files
made your pipeline fail. You fix the error and decide to document your
collection (a good thing anyway). But then your collection grows even more.
The burden of managing the test files, their documentation and the test code
becomes unbearable.

Now imagine doing it not for one, but for a number of projects. You want
to be smart, so you make a collection of files, document it properly and
think you've done a good job. But then you start to realise that you need to
deviate from the collection or add new files to it to support new use cases.
You want to be safe and decide to version control it. Your collection grows,
you start to accept PRs from other devs and go down the rabbit hole of
owning another critical repository. Your documentation grows and so does
the repository size (mostly binary content). Storing such a huge number of
files becomes a burden. It slows everyone down.

Not to mention that you might not be allowed to store some of the files
you're using for testing centrally, because you would first need to obfuscate
or anonymize them to address the legal concerns of privacy regulations.

When test files are generated dynamically
-----------------------------------------
When test files are generated dynamically, you are relieved of most of the
concerns mentioned above. There are a couple of drawbacks here too, such as
test execution time: generating test files on the fly requires some
computational resources, and therefore your CI execution time will grow.

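To make the trade-off concrete, here is a minimal sketch that uses only the Python standard library, not `faker-file`_ itself. The ``make_text_file`` helper is hypothetical and exists only for this illustration. It creates throwaway files on the fly, measures how long that takes, and leaves nothing behind to store or document.

```python
import random
import string
import tempfile
import time
from pathlib import Path


def make_text_file(directory: Path, nb_chars: int = 50) -> Path:
    """Write ``nb_chars`` random characters to a new file and return its path.

    Hypothetical helper for illustration only; it is not part of faker-file.
    """
    content = "".join(random.choices(string.ascii_letters + " ", k=nb_chars))
    # delete=False keeps the file on disk after the handle is closed.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", dir=str(directory), delete=False
    ) as handle:
        handle.write(content)
    return Path(handle.name)


with tempfile.TemporaryDirectory() as tmp_dir:
    started = time.perf_counter()
    paths = [make_text_file(Path(tmp_dir)) for _ in range(100)]
    elapsed = time.perf_counter() - started
    sizes = [path.stat().st_size for path in paths]
    print(f"Generated {len(paths)} files in {elapsed:.4f} seconds")
# The temporary directory and every generated file are removed at this point.
```

The measured time is the cost paid on every test run. In exchange, nothing has to be stored, versioned or documented.
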
Best practices
--------------
In some very specific use cases, mimicking the original files might be too
difficult, and you might still want to include some of the very specific and
hard-to-recreate files in the project repository, but on a much smaller
scale. Use `faker-file`_ for simple use cases and only use custom files when
things get too complicated otherwise. This is the so-called hybrid approach.

A couple of use cases where `faker-file`_ can help you out:

Create a simple DOCX file
~~~~~~~~~~~~~~~~~~~~~~~~~
Let's imagine we need to generate a DOCX file with text at most 50 characters
long (short enough to inspect easily).

.. code-block:: python

    from faker import Faker
    from faker_file.providers.docx_file import DocxFileProvider

    FAKER = Faker()
    FAKER.add_provider(DocxFileProvider)

    file = FAKER.docx_file(max_nb_chars=50)
    print(file)  # Sample value: 'tmp/tmpgdctmfbp.docx'
    print(file.data["content"])  # Sample value: 'Learn where receive social.'
    print(file.data["filename"])  # Sample value: '/tmp/tmp/tmpgdctmfbp.docx'

Create a more structured DOCX file
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Imagine you need a sample letter. It contains a date and place, a greeting,
body text, an address and a signature block.

.. code-block:: python

    # ``FAKER`` is the instance configured in the previous example.
    TEMPLATE = """
    {{date}} {{city}}, {{country}}
    Hello {{name}},

    {{text}}

    Address: {{address}}

    Best regards,

    {{name}}
    {{address}}
    {{phone_number}}
    """

    file = FAKER.docx_file(content=TEMPLATE)
    print(file)  # Sample value: 'tmp/tmpgdctmfbp.docx'
    print(file.data["content"])
    # Sample value below:
    # 2009-05-14 Pettyberg, Puerto Rico
    # Hello Lauren Williams,
    #
    # Everyone bill I information. Put particularly note language support
    # green. Game free family probably case day vote.
    # Commercial especially game heart.
    #
    # Address: 19017 Jennifer Drives
    # Jamesbury, MI 39121
    #
    # Best regards,
    #
    # Robin Jones
    # 4650 Paul Extensions
    # Port Johnside, VI 78151
    # 001-704-255-3093

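To illustrate the idea behind ``{{placeholder}}`` templates, here is a minimal standard-library sketch. It is not how Faker or `faker-file`_ implement template rendering; the ``GENERATORS`` mapping and the ``render`` function are hypothetical stand-ins, and the fixed strings take the place of randomly generated values.

```python
import re

# Hypothetical generators standing in for Faker providers; a real run would
# return random values instead of these fixed strings.
GENERATORS = {
    "name": lambda: "Lauren Williams",
    "city": lambda: "Pettyberg",
}

# Matches ``{{token}}`` with optional whitespace around the token name.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str) -> str:
    """Replace every ``{{token}}`` with the output of its generator."""

    def substitute(match: re.Match) -> str:
        token = match.group(1)
        if token not in GENERATORS:
            # Leave unknown placeholders untouched.
            return match.group(0)
        # The generator is called once per occurrence, not once per token.
        return GENERATORS[token]()

    return PLACEHOLDER.sub(substitute, template)


rendered = render("Hello {{name}} from {{city}}, {{unknown}}!")
print(rendered)
```

Because each occurrence is resolved separately, a placeholder that appears twice can yield two different values once the generators are random. That would be consistent with the sample output above, where the two ``{{name}}`` occurrences produce different names.
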
Create an even more structured DOCX file
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Imagine you need to generate a highly customised document with several types
of content, such as images, tables, manual page breaks, paragraphs, etc.

.. code-block:: python

    # Additional imports
    from faker_file.base import DynamicTemplate
    from faker_file.contrib.docx_file import (
        add_page_break,
        add_paragraph,
        add_picture,
        add_table,
    )

    # Create a DOCX file with a paragraph, a picture and a table, with manual
    # page breaks in between. The ``DynamicTemplate`` accepts a list of
    # pairs: a callable (such as ``add_paragraph`` or ``add_page_break``) and
    # a dictionary that is later fed to the callable as keyword arguments
    # for customising the default values.
    file = FAKER.docx_file(
        content=DynamicTemplate(
            [
                (add_paragraph, {}),  # Add paragraph
                (add_page_break, {}),  # Add page break
                (add_picture, {}),  # Add picture
                (add_page_break, {}),  # Add page break
                (add_table, {}),  # Add table
                (add_page_break, {}),  # Add page break
            ]
        )
    )

.. note::

    All callables accept arguments. You could pass ``content=TEMPLATE``
    to the ``add_paragraph`` function and, instead of just random text,
    you would get a more structured paragraph (as in one of the previous
    examples).

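The "list of callables plus keyword arguments" pattern can be sketched without any dependencies. The code below is not the ``DynamicTemplate`` implementation. The ``paragraph_step``, ``page_break_step`` and ``build`` names are hypothetical, and a plain list of strings stands in for a real document object.

```python
from typing import Any, Callable, Dict, List, Tuple

# In this sketch a "document" is just a list of strings.
Document = List[str]
Step = Tuple[Callable[..., None], Dict[str, Any]]


def paragraph_step(document: Document, content: str = "random text") -> None:
    """Append a paragraph; ``content`` overrides the default text."""
    document.append(f"PARAGRAPH: {content}")


def page_break_step(document: Document) -> None:
    """Append a page break marker."""
    document.append("PAGE BREAK")


def build(steps: List[Step]) -> Document:
    """Apply each callable in order, passing its dictionary as keyword arguments."""
    document: Document = []
    for func, kwargs in steps:
        func(document, **kwargs)
    return document


result = build(
    [
        (paragraph_step, {}),  # Defaults only
        (page_break_step, {}),
        (paragraph_step, {"content": "Hello"}),  # Customised via kwargs
    ]
)
print(result)
```

An empty dictionary keeps a step's defaults, while a populated one customises it. This mirrors the role of the ``{}`` entries in the example above.
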
For when you think `faker-file`_ isn't enough
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
As previously mentioned, some test documents are too complex to replicate
easily, and you may want to store just a few very specific documents in the
project repository.

`faker-file`_ comes with a couple of providers that might still help you
in that case.

Both `FileFromPathProvider`_ and `RandomFileFromDirProvider`_ were created to
support the hybrid approach.

FileFromPathProvider
^^^^^^^^^^^^^^^^^^^^
Create a file by copying it from the given path.

- Create an exact copy of a file under a different name.
- The destination file name is prefixed with ``zzz``.
- ``path`` is the absolute path to the file to copy.

.. code-block:: python

    from faker import Faker
    from faker_file.providers.file_from_path import FileFromPathProvider

    FAKER = Faker()
    FAKER.add_provider(FileFromPathProvider)

    file = FAKER.file_from_path(
        path="/path/to/file.docx",
        prefix="zzz",
    )

Now you don't have to copy-paste your file from one place to another.
It will be done for you in a convenient way.

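For comparison, here is roughly what such a copy involves when done by hand with the standard library. This is a sketch under assumptions, not the provider's implementation; ``copy_with_prefix`` is a hypothetical helper and the source file is created only for the demonstration.

```python
import os
import shutil
import tempfile
from pathlib import Path


def copy_with_prefix(path: Path, prefix: str, dest_dir: Path) -> Path:
    """Copy ``path`` into ``dest_dir`` under a unique name starting with ``prefix``."""
    # mkstemp creates an empty, uniquely named file and returns an open
    # descriptor, which is closed before the content is copied over.
    descriptor, dest_name = tempfile.mkstemp(
        prefix=prefix, suffix=path.suffix, dir=str(dest_dir)
    )
    os.close(descriptor)
    shutil.copyfile(path, dest_name)
    return Path(dest_name)


with tempfile.TemporaryDirectory() as tmp_dir:
    source = Path(tmp_dir) / "original.docx"
    source.write_bytes(b"pretend this is a DOCX file")
    copy = copy_with_prefix(source, "zzz", Path(tmp_dir))
    copy_name = copy.name
    same_content = copy.read_bytes() == source.read_bytes()
    print(copy_name)
```
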
RandomFileFromDirProvider
^^^^^^^^^^^^^^^^^^^^^^^^^
Create a file by copying a randomly picked file from the given directory.

- Create an exact copy of the randomly picked file under a different name.
- The destination file name is prefixed with ``zzz``.
- ``source_dir_path`` is the absolute path to the directory to pick files from.

.. code-block:: python

    from faker_file.providers.random_file_from_dir import (
        RandomFileFromDirProvider,
    )

    # ``FAKER`` is the instance created in the previous example.
    file = RandomFileFromDirProvider(FAKER).random_file_from_dir(
        source_dir_path="/tmp/tmp/",
        prefix="zzz",
    )

Now you don't have to copy-paste your file from one place to another.
It will be done for you in a convenient way.

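The random pick can likewise be sketched with the standard library alone. The ``copy_random_file`` helper below is hypothetical and is not the provider's implementation. It assumes the source directory contains at least one regular file and raises an error otherwise.

```python
import random
import shutil
import tempfile
from pathlib import Path


def copy_random_file(source_dir: Path, prefix: str, dest_dir: Path) -> Path:
    """Copy one randomly chosen file from ``source_dir`` into ``dest_dir``."""
    candidates = sorted(p for p in source_dir.iterdir() if p.is_file())
    if not candidates:
        raise FileNotFoundError(f"No files found in {source_dir}")
    picked = random.choice(candidates)
    # Keep the original name after the prefix so the source stays traceable.
    destination = dest_dir / f"{prefix}{picked.name}"
    shutil.copyfile(picked, destination)
    return destination


with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    for name in ("a.txt", "b.txt", "c.txt"):
        (Path(src) / name).write_text(f"content of {name}")
    copy = copy_random_file(Path(src), "zzz", Path(dst))
    copy_name = copy.name
    copy_text = copy.read_text()
    print(copy_name, copy_text)
```

Which file gets picked varies between runs. Tests built on this pattern should therefore assert only on properties shared by every file in the directory.
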