Fix client_info bug, update docstrings and timeouts. (#6421)
dpebot authored and tseaver committed Nov 6, 2018
1 parent c566550 commit 3964ad2
Showing 8 changed files with 100 additions and 88 deletions.
65 changes: 33 additions & 32 deletions speech/google/cloud/speech_v1/gapic/enums.py
@@ -25,51 +25,51 @@ class AudioEncoding(enum.IntEnum):
All encodings support only 1 channel (mono) audio.
For best results, the audio source should be captured and transmitted using
a lossless encoding (``FLAC`` or ``LINEAR16``). The accuracy of the speech
recognition can be reduced if lossy codecs are used to capture or transmit
audio, particularly if background noise is present. Lossy codecs include
``MULAW``, ``AMR``, ``AMR_WB``, ``OGG_OPUS``, and ``SPEEX_WITH_HEADER_BYTE``.
For best results, the audio source should be captured and transmitted
using a lossless encoding (``FLAC`` or ``LINEAR16``). The accuracy of
the speech recognition can be reduced if lossy codecs are used to
capture or transmit audio, particularly if background noise is present.
Lossy codecs include ``MULAW``, ``AMR``, ``AMR_WB``, ``OGG_OPUS``, and
``SPEEX_WITH_HEADER_BYTE``.
The ``FLAC`` and ``WAV`` audio file formats include a header that describes the
included audio content. You can request recognition for ``WAV`` files that
contain either ``LINEAR16`` or ``MULAW`` encoded audio.
If you send ``FLAC`` or ``WAV`` audio file format in
your request, you do not need to specify an ``AudioEncoding``; the audio
encoding format is determined from the file header. If you specify
an ``AudioEncoding`` when you send send ``FLAC`` or ``WAV`` audio, the
The ``FLAC`` and ``WAV`` audio file formats include a header that
describes the included audio content. You can request recognition for
``WAV`` files that contain either ``LINEAR16`` or ``MULAW`` encoded
audio. If you send ``FLAC`` or ``WAV`` audio file format in your
request, you do not need to specify an ``AudioEncoding``; the audio
encoding format is determined from the file header. If you specify an
``AudioEncoding`` when you send send ``FLAC`` or ``WAV`` audio, the
encoding configuration must match the encoding described in the audio
header; otherwise the request returns an
``google.rpc.Code.INVALID_ARGUMENT`` error code.
Attributes:
ENCODING_UNSPECIFIED (int): Not specified.
LINEAR16 (int): Uncompressed 16-bit signed little-endian samples (Linear PCM).
FLAC (int): ``FLAC`` (Free Lossless Audio
Codec) is the recommended encoding because it is
lossless--therefore recognition is not compromised--and
requires only about half the bandwidth of ``LINEAR16``. ``FLAC`` stream
encoding supports 16-bit and 24-bit samples, however, not all fields in
FLAC (int): ``FLAC`` (Free Lossless Audio Codec) is the recommended encoding because
it is lossless--therefore recognition is not compromised--and requires
only about half the bandwidth of ``LINEAR16``. ``FLAC`` stream encoding
supports 16-bit and 24-bit samples, however, not all fields in
``STREAMINFO`` are supported.
MULAW (int): 8-bit samples that compand 14-bit audio samples using G.711 PCMU/mu-law.
AMR (int): Adaptive Multi-Rate Narrowband codec. ``sample_rate_hertz`` must be 8000.
AMR (int): Adaptive Multi-Rate Narrowband codec. ``sample_rate_hertz`` must be
8000.
AMR_WB (int): Adaptive Multi-Rate Wideband codec. ``sample_rate_hertz`` must be 16000.
OGG_OPUS (int): Opus encoded audio frames in Ogg container
(`OggOpus <https://wiki.xiph.org/OggOpus>`_).
``sample_rate_hertz`` must be one of 8000, 12000, 16000, 24000, or 48000.
(`OggOpus <https://wiki.xiph.org/OggOpus>`__). ``sample_rate_hertz``
must be one of 8000, 12000, 16000, 24000, or 48000.
SPEEX_WITH_HEADER_BYTE (int): Although the use of lossy encodings is not recommended, if a very low
bitrate encoding is required, ``OGG_OPUS`` is highly preferred over
Speex encoding. The `Speex <https://speex.org/>`_ encoding supported by
Speex encoding. The `Speex <https://speex.org/>`__ encoding supported by
Cloud Speech API has a header byte in each block, as in MIME type
``audio/x-speex-with-header-byte``.
It is a variant of the RTP Speex encoding defined in
`RFC 5574 <https://tools.ietf.org/html/rfc5574>`_.
``audio/x-speex-with-header-byte``. It is a variant of the RTP Speex
encoding defined in `RFC 5574 <https://tools.ietf.org/html/rfc5574>`__.
The stream is a sequence of blocks, one block per RTP packet. Each block
starts with a byte containing the length of the block, in bytes, followed
by one or more frames of Speex data, padded to an integral number of
bytes (octets) as specified in RFC 5574. In other words, each RTP header
is replaced with a single byte containing the block length. Only Speex
wideband is supported. ``sample_rate_hertz`` must be 16000.
starts with a byte containing the length of the block, in bytes,
followed by one or more frames of Speex data, padded to an integral
number of bytes (octets) as specified in RFC 5574. In other words, each
RTP header is replaced with a single byte containing the block length.
Only Speex wideband is supported. ``sample_rate_hertz`` must be 16000.
"""
ENCODING_UNSPECIFIED = 0
LINEAR16 = 1
@@ -92,9 +92,10 @@ class SpeechEventType(enum.IntEnum):
speech utterance and expects no additional speech. Therefore, the server
will not process additional audio (although it may subsequently return
additional results). The client should stop sending additional audio
data, half-close the gRPC connection, and wait for any additional results
until the server closes the gRPC connection. This event is only sent if
``single_utterance`` was set to ``true``, and is not used otherwise.
data, half-close the gRPC connection, and wait for any additional
results until the server closes the gRPC connection. This event is only
sent if ``single_utterance`` was set to ``true``, and is not used
otherwise.
"""
SPEECH_EVENT_UNSPECIFIED = 0
END_OF_SINGLE_UTTERANCE = 1
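The block framing that the ``SPEEX_WITH_HEADER_BYTE`` docstring above describes (one length byte, then the Speex frames for that block) can be sketched as a small splitter. This is only an illustration, not part of the commit or the library: the function name ``split_speex_blocks`` is hypothetical, and it assumes the length byte counts the payload bytes that follow it and does not count itself, which the docstring does not state explicitly.

```python
from typing import List


def split_speex_blocks(stream: bytes) -> List[bytes]:
    """Split a header-byte-framed stream into per-block payloads.

    Assumption (not confirmed by the docstring): the header byte holds the
    number of payload bytes that follow it, excluding the header itself.
    """
    blocks = []
    pos = 0
    while pos < len(stream):
        length = stream[pos]          # single header byte: block length
        start = pos + 1
        end = start + length
        if end > len(stream):
            raise ValueError("truncated block at offset %d" % pos)
        blocks.append(stream[start:end])  # Speex frame data for this block
        pos = end
    return blocks


# Two blocks: a 2-byte payload followed by a 1-byte payload.
example_blocks = split_speex_blocks(bytes([2, 0xAA, 0xBB, 1, 0xCC]))
```

Because the length is a single byte, a block under this framing can carry at most 255 bytes of payload.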
15 changes: 10 additions & 5 deletions speech/google/cloud/speech_v1/gapic/speech_client.py
@@ -138,9 +138,10 @@ def __init__(self,
)

if client_info is None:
client_info = (
google.api_core.gapic_v1.client_info.DEFAULT_CLIENT_INFO)
client_info.gapic_version = _GAPIC_LIBRARY_VERSION
client_info = google.api_core.gapic_v1.client_info.ClientInfo(
gapic_version=_GAPIC_LIBRARY_VERSION, )
else:
client_info.gapic_version = _GAPIC_LIBRARY_VERSION
self._client_info = client_info

# Parse out the default settings for retry and timeout for each RPC
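The commit message does not spell out the ``client_info`` bug, but one plausible reading of the hunk above is that the old code assigned ``gapic_version`` on a shared module-level default object, so every client that relied on the default mutated the same instance. The sketch below reproduces that pattern with a hypothetical stand-in ``ClientInfo`` class (it is not the real ``google.api_core`` class), and the helper names ``init_old`` and ``init_new`` and the version strings are made up for illustration.

```python
class ClientInfo:
    """Hypothetical stand-in for the real ClientInfo class."""

    def __init__(self, gapic_version=None):
        self.gapic_version = gapic_version


# Shared module-level default, as in the removed lines of the hunk.
DEFAULT_CLIENT_INFO = ClientInfo()


def init_old(version, client_info=None):
    # Old pattern: fall back to the shared default, then mutate it.
    if client_info is None:
        client_info = DEFAULT_CLIENT_INFO
    client_info.gapic_version = version
    return client_info


def init_new(version, client_info=None):
    # New pattern: build a fresh object when the caller passes nothing.
    if client_info is None:
        client_info = ClientInfo(gapic_version=version)
    else:
        client_info.gapic_version = version
    return client_info


# Two clients with different library versions share one object under the
# old pattern, so the second assignment overwrites the first.
old_a = init_old("0.36.0")
old_b = init_old("1.2.3")

# Under the new pattern each client keeps its own version.
new_a = init_new("0.36.0")
new_b = init_new("1.2.3")
```

Under the old pattern ``old_a`` and ``old_b`` are the same object, so ``old_a.gapic_version`` ends up as the second client's version; under the new pattern the two objects stay independent.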
@@ -185,9 +186,11 @@ def recognize(self,
Args:
config (Union[dict, ~google.cloud.speech_v1.types.RecognitionConfig]): *Required* Provides information to the recognizer that specifies how to
process the request.
If a dict is provided, it must be of the same form as the protobuf
message :class:`~google.cloud.speech_v1.types.RecognitionConfig`
audio (Union[dict, ~google.cloud.speech_v1.types.RecognitionAudio]): *Required* The audio data to be recognized.
If a dict is provided, it must be of the same form as the protobuf
message :class:`~google.cloud.speech_v1.types.RecognitionAudio`
retry (Optional[google.api_core.retry.Retry]): A retry object used
@@ -235,8 +238,8 @@ def long_running_recognize(self,
"""
Performs asynchronous speech recognition: receive results via the
google.longrunning.Operations interface. Returns either an
``Operation.error`` or an ``Operation.response`` which contains
a ``LongRunningRecognizeResponse`` message.
``Operation.error`` or an ``Operation.response`` which contains a
``LongRunningRecognizeResponse`` message.
Example:
>>> from google.cloud import speech_v1
@@ -265,9 +268,11 @@ def long_running_recognize(self,
Args:
config (Union[dict, ~google.cloud.speech_v1.types.RecognitionConfig]): *Required* Provides information to the recognizer that specifies how to
process the request.
If a dict is provided, it must be of the same form as the protobuf
message :class:`~google.cloud.speech_v1.types.RecognitionConfig`
audio (Union[dict, ~google.cloud.speech_v1.types.RecognitionAudio]): *Required* The audio data to be recognized.
If a dict is provided, it must be of the same form as the protobuf
message :class:`~google.cloud.speech_v1.types.RecognitionAudio`
retry (Optional[google.api_core.retry.Retry]): A retry object used
6 changes: 3 additions & 3 deletions speech/google/cloud/speech_v1/gapic/speech_client_config.py
@@ -18,17 +18,17 @@
},
"methods": {
"Recognize": {
"timeout_millis": 1000000,
"timeout_millis": 200000,
"retry_codes_name": "idempotent",
"retry_params_name": "default"
},
"LongRunningRecognize": {
"timeout_millis": 60000,
"timeout_millis": 200000,
"retry_codes_name": "non_idempotent",
"retry_params_name": "default"
},
"StreamingRecognize": {
"timeout_millis": 1000000,
"timeout_millis": 200000,
"retry_codes_name": "idempotent",
"retry_params_name": "default"
}
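Since this view has lost its +/- markers, it may help to read each ``timeout_millis`` pair as "old value, then new value": the hunk appears to set all three methods to 200000 ms. The sketch below shows how such a per-method setting could be read and converted to seconds. The dict mirrors only the fragment visible in the diff (the real config file has more nesting around it), and ``timeout_seconds`` is a hypothetical helper, not a library function.

```python
# Fragment of the config as it appears after the change (new values only).
CONFIG_FRAGMENT = {
    "methods": {
        "Recognize": {
            "timeout_millis": 200000,
            "retry_codes_name": "idempotent",
            "retry_params_name": "default",
        },
        "LongRunningRecognize": {
            "timeout_millis": 200000,
            "retry_codes_name": "non_idempotent",
            "retry_params_name": "default",
        },
        "StreamingRecognize": {
            "timeout_millis": 200000,
            "retry_codes_name": "idempotent",
            "retry_params_name": "default",
        },
    }
}


def timeout_seconds(config, method):
    """Return the configured timeout for one method, in seconds."""
    return config["methods"][method]["timeout_millis"] / 1000.0


timeouts = {name: timeout_seconds(CONFIG_FRAGMENT, name)
            for name in CONFIG_FRAGMENT["methods"]}
```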
@@ -118,8 +118,8 @@ def long_running_recognize(self):
Performs asynchronous speech recognition: receive results via the
google.longrunning.Operations interface. Returns either an
``Operation.error`` or an ``Operation.response`` which contains
a ``LongRunningRecognizeResponse`` message.
``Operation.error`` or an ``Operation.response`` which contains a
``LongRunningRecognizeResponse`` message.
Returns:
Callable: A callable which accepts the appropriate
73 changes: 37 additions & 36 deletions speech/google/cloud/speech_v1p1beta1/gapic/enums.py
@@ -25,51 +25,51 @@ class AudioEncoding(enum.IntEnum):
All encodings support only 1 channel (mono) audio.
For best results, the audio source should be captured and transmitted using
a lossless encoding (``FLAC`` or ``LINEAR16``). The accuracy of the speech
recognition can be reduced if lossy codecs are used to capture or transmit
audio, particularly if background noise is present. Lossy codecs include
``MULAW``, ``AMR``, ``AMR_WB``, ``OGG_OPUS``, and ``SPEEX_WITH_HEADER_BYTE``.
The ``FLAC`` and ``WAV`` audio file formats include a header that describes the
included audio content. You can request recognition for ``WAV`` files that
contain either ``LINEAR16`` or ``MULAW`` encoded audio.
If you send ``FLAC`` or ``WAV`` audio file format in
your request, you do not need to specify an ``AudioEncoding``; the audio
encoding format is determined from the file header. If you specify
an ``AudioEncoding`` when you send send ``FLAC`` or ``WAV`` audio, the
For best results, the audio source should be captured and transmitted
using a lossless encoding (``FLAC`` or ``LINEAR16``). The accuracy of
the speech recognition can be reduced if lossy codecs are used to
capture or transmit audio, particularly if background noise is present.
Lossy codecs include ``MULAW``, ``AMR``, ``AMR_WB``, ``OGG_OPUS``, and
``SPEEX_WITH_HEADER_BYTE``.
The ``FLAC`` and ``WAV`` audio file formats include a header that
describes the included audio content. You can request recognition for
``WAV`` files that contain either ``LINEAR16`` or ``MULAW`` encoded
audio. If you send ``FLAC`` or ``WAV`` audio file format in your
request, you do not need to specify an ``AudioEncoding``; the audio
encoding format is determined from the file header. If you specify an
``AudioEncoding`` when you send send ``FLAC`` or ``WAV`` audio, the
encoding configuration must match the encoding described in the audio
header; otherwise the request returns an
``google.rpc.Code.INVALID_ARGUMENT`` error code.
Attributes:
ENCODING_UNSPECIFIED (int): Not specified.
LINEAR16 (int): Uncompressed 16-bit signed little-endian samples (Linear PCM).
FLAC (int): ``FLAC`` (Free Lossless Audio
Codec) is the recommended encoding because it is
lossless--therefore recognition is not compromised--and
requires only about half the bandwidth of ``LINEAR16``. ``FLAC`` stream
encoding supports 16-bit and 24-bit samples, however, not all fields in
FLAC (int): ``FLAC`` (Free Lossless Audio Codec) is the recommended encoding because
it is lossless--therefore recognition is not compromised--and requires
only about half the bandwidth of ``LINEAR16``. ``FLAC`` stream encoding
supports 16-bit and 24-bit samples, however, not all fields in
``STREAMINFO`` are supported.
MULAW (int): 8-bit samples that compand 14-bit audio samples using G.711 PCMU/mu-law.
AMR (int): Adaptive Multi-Rate Narrowband codec. ``sample_rate_hertz`` must be 8000.
AMR (int): Adaptive Multi-Rate Narrowband codec. ``sample_rate_hertz`` must be
8000.
AMR_WB (int): Adaptive Multi-Rate Wideband codec. ``sample_rate_hertz`` must be 16000.
OGG_OPUS (int): Opus encoded audio frames in Ogg container
(`OggOpus <https://wiki.xiph.org/OggOpus>`_).
``sample_rate_hertz`` must be one of 8000, 12000, 16000, 24000, or 48000.
(`OggOpus <https://wiki.xiph.org/OggOpus>`__). ``sample_rate_hertz``
must be one of 8000, 12000, 16000, 24000, or 48000.
SPEEX_WITH_HEADER_BYTE (int): Although the use of lossy encodings is not recommended, if a very low
bitrate encoding is required, ``OGG_OPUS`` is highly preferred over
Speex encoding. The `Speex <https://speex.org/>`_ encoding supported by
Speex encoding. The `Speex <https://speex.org/>`__ encoding supported by
Cloud Speech API has a header byte in each block, as in MIME type
``audio/x-speex-with-header-byte``.
It is a variant of the RTP Speex encoding defined in
`RFC 5574 <https://tools.ietf.org/html/rfc5574>`_.
``audio/x-speex-with-header-byte``. It is a variant of the RTP Speex
encoding defined in `RFC 5574 <https://tools.ietf.org/html/rfc5574>`__.
The stream is a sequence of blocks, one block per RTP packet. Each block
starts with a byte containing the length of the block, in bytes, followed
by one or more frames of Speex data, padded to an integral number of
bytes (octets) as specified in RFC 5574. In other words, each RTP header
is replaced with a single byte containing the block length. Only Speex
wideband is supported. ``sample_rate_hertz`` must be 16000.
starts with a byte containing the length of the block, in bytes,
followed by one or more frames of Speex data, padded to an integral
number of bytes (octets) as specified in RFC 5574. In other words, each
RTP header is replaced with a single byte containing the block length.
Only Speex wideband is supported. ``sample_rate_hertz`` must be 16000.
"""
ENCODING_UNSPECIFIED = 0
LINEAR16 = 1
@@ -91,9 +91,9 @@ class InteractionType(enum.IntEnum):
INTERACTION_TYPE_UNSPECIFIED (int): Use case is either unknown or is something other than one of the other
values below.
DISCUSSION (int): Multiple people in a conversation or discussion. For example in a
meeting with two or more people actively participating. Typically
all the primary people speaking would be in the same room (if not,
see PHONE_CALL)
meeting with two or more people actively participating. Typically all
the primary people speaking would be in the same room (if not, see
PHONE\_CALL)
PRESENTATION (int): One or more persons lecturing or presenting to others, mostly
uninterrupted.
PHONE_CALL (int): A phone-call or video-conference in which two or more people, who are
@@ -178,9 +178,10 @@ class SpeechEventType(enum.IntEnum):
speech utterance and expects no additional speech. Therefore, the server
will not process additional audio (although it may subsequently return
additional results). The client should stop sending additional audio
data, half-close the gRPC connection, and wait for any additional results
until the server closes the gRPC connection. This event is only sent if
``single_utterance`` was set to ``true``, and is not used otherwise.
data, half-close the gRPC connection, and wait for any additional
results until the server closes the gRPC connection. This event is only
sent if ``single_utterance`` was set to ``true``, and is not used
otherwise.
"""
SPEECH_EVENT_UNSPECIFIED = 0
END_OF_SINGLE_UTTERANCE = 1